Search This Blog

Tuesday, April 6, 2010

Converting Files of One Encoding To Another Encoding

My design for the SCJD required the use of a FilePropertiesManager for writing and reading application-specific properties.
My implementation wrapped a Properties object that was then read from and written to. The properties file is "ISO-8859-1" encoded.
One of the steps in creating a JUnit test case for this class was to create all the necessary support functions. One such function was to read an "ISO-8859-1" encoded properties file and write it back out in one of the many supported encodings.


Here's the test method:
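What follows is a minimal JUnit sketch of what such a test might look like; the class name FilePropertiesManagerTest, the helper EncodingConverter, and the key/value pair are illustrative, not necessarily the original implementation.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.util.Properties;
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class FilePropertiesManagerTest {

    @Test
    public void convertsIso88591PropertiesToUtf8() throws Exception {
        // Write a small ISO-8859-1 encoded properties file.
        File source = File.createTempFile("props", ".properties");
        Writer out = new OutputStreamWriter(new FileOutputStream(source), "ISO-8859-1");
        out.write("greeting=caf\u00e9\n");
        out.close();

        // Convert it to one of the other supported encodings.
        File target = File.createTempFile("props-utf8", ".properties");
        EncodingConverter.convert(source, target, "UTF-8");

        // Reload the converted file and verify the value survived the round trip.
        Properties props = new Properties();
        Reader in = new InputStreamReader(new FileInputStream(target), "UTF-8");
        props.load(in);
        in.close();
        assertEquals("caf\u00e9", props.getProperty("greeting"));
    }
}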
Here's the method:
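A minimal sketch of the conversion itself; the class and method names are illustrative:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

public class EncodingConverter {

    // Reads an ISO-8859-1 encoded file and rewrites it in the target encoding.
    public static void convert(File source, File target, String targetEncoding)
            throws IOException {
        Reader in = new InputStreamReader(new FileInputStream(source), "ISO-8859-1");
        Writer out = new OutputStreamWriter(new FileOutputStream(target), targetEncoding);
        try {
            char[] buffer = new char[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } finally {
            in.close();
            out.close();
        }
    }
}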

Saturday, February 27, 2010

Variables and Identifiers

Variables and Identifiers: what’s the diff?

Some ridiculously wise guy once said, "A house is only as strong as the foundation on which it is built." The same principle applies to a Java developer: a Java developer is only as good as his knowledge of basic Java fundamentals. So if you're seeking some fundamental Java truth, read on, because this month's topic builds on those foundations as we attempt to clarify the difference between a variable and an identifier.
I was often confused by this concept (actually, I didn't know the difference) and treated the two as one and the same, because general discussions of Java make no distinction between variables and identifiers, and in some cases, as will be discussed shortly, the two really can be treated as the same, namely with primitives. Why this distinction is not drawn in the more basic Java material is, in my humble opinion, because it doesn't take an understanding of bytecode to implement 'HelloWorld'. But as we approach more advanced topics in Java, the distinction between a variable and an identifier becomes quite important.
The difference between languages that carry data types on runtime objects (such as Java) and those that use data types only at compile time to generate the executable code that manipulates those objects (such as C/C++) is largely rooted in the distinction between identifiers and variables, regardless of the syntactic similarities the two languages share. One main distinction is that objects are handled quite differently at runtime, and this difference affects how objects are manipulated in the language in general. For a Java developer, understanding this difference, specifically in terms of how Java treats objects, will aid in understanding various concepts throughout Java, such as what happens when objects are passed between other objects or around a network using tools such as RMI, or the concept of an interface. These concepts and more become quite clear to the programmer once the nature of objects at compile time and runtime is understood.
An identifier is the name and associated data type that identifies a variable in a Java source file, or more simply, the name of the variable as it appears in the source code of the program.
For instance:
int aPrimitiveInt
where, aPrimitiveInt is the name and int is the data type.
A variable, on the other hand, is the actual instantiation of that data type at runtime (the memory storage of the data values being manipulated in the program), or simply the actual memory that is allocated at runtime.
For instance:
int aPrimitiveInt = 24;
where, 24 is the variable, or in this case, the actual contents stored in memory.
For clarity's sake, a data type is a set of data values and the operations on those data values.
As a 'newbie' to Java I held a rather fuzzy understanding of this difference between a variable and an identifier, and hence confused the definitions and concepts and referred to both as a variable. As it stands, I was half right: in the case of primitives (int, float, double, etc.) the difference between the identifier and the variable is irrelevant.
…well, sort of.
Java implements two mechanisms to ensure type matching between compile-time identifiers and runtime variables: one for objects, which is by design, and one for primitives, which is not so much a mechanism by design as a mechanism that results from the design. The specifics behind the mechanism for primitives are beyond the scope of this article and will not be discussed. The mechanism itself, however, will be.
That mechanism is this: the data type of the variable referenced by a primitive's identifier cannot be changed in Java. In other words, if a value of a different data type is used in an assignment statement for a primitive, the assignment always involves copying the value and converting the copy to the correct type during compilation.
Primitive data types are built directly into the Java virtual machine, so the virtual machine shares the same representations and operations for primitives as the Java compiler. This means the compiler can generate the code that manipulates these variables, and once compilation is complete their data types can be safely forgotten. The programmer therefore knows that the data type of the identifier is the same as the data type of the variable it references.
Below is a code fragment to help visualize this concept:
int i = (int) 1.234f;
Here, we have an int and we want to assign it a float value. Because the data type of the variable referenced by the identifier cannot be changed (the int cannot become a float; in fact, Java will not even compile this assignment without the explicit (int) cast), the float gets copied into another variable with a data type of int (the copy holds the truncated value 1). The value (1.234f) cannot change the data type of the identifier (i) it is assigned to. So, while it is wrong to say that the identifier and the variable are the same, the distinction can safely be ignored for primitive data types.
Well, I hope you all now have more of a foundational understanding of the Java language, and specifically of the mechanism in Java for type matching between compile-time identifiers and runtime variables for primitives. Remember, the distinction between an identifier (the name and the data type it represents) and its associated variable (the actual instantiation of the data at runtime) can be ignored for primitives, because Java does not allow the data type of the variable referenced by a primitive's identifier to be changed. For those who did not know this distinction and thought they were one and the same: you now know different, and can continue 'not knowing' as you, in a way, kind of 'now knowing', are now, in a sense, a wise guy.
Next time, we'll talk about the other mechanism (the one that is by design in Java) for ensuring that objects are type safe: the runtime data type tag.

Inner Classes and Shtuff

Inner Classes

By: Troy D. Travis
CMSC335

Table of Contents



I. Introduction
II. History and Basic Facts of Inner Classes in Java
III. Four Types of Inner Classes
a. Nested Inner Classes
b. Inner Classes
c. Local Inner Classes
d. Anonymous Inner Classes
IV. Advantages of using Inner Classes
V. Inner Class Uses and Strategies
a. Callbacks
b. Inner Class Strategy, Data Access Objects
VI. References

Introduction

Inner classes are a very useful concept in the Java language. They can help keep code cleaner, they can help control access to objects, and they provide a certain sense of robustness that aids object-oriented design.
This paper will address the history of the inner class as it figures into the Java language. We will also discuss the various kinds of inner classes, their names, how many and what types of files are generated by the compiler, their access restrictions, and why those restrictions are what they are. We will then discuss some examples of and purposes for inner classes in the development world.

History and Basic Facts of Inner Classes in Java

Up until JDK 1.1, the Java language only supported top-level classes. With JDK 1.1, Sun introduced the concept of nested/inner classes. This allows a developer to use inner classes as a sort of Swiss Army knife: they can be used to do many different things, and they can reproduce some C++ features that are otherwise unavailable in Java. The inner class feature lives entirely in the compiler. To compile code that contains inner classes you will need JDK 1.1 or later, BUT, because of the way the class files are generated during compilation, the compiled code can be executed on any JVM. The naming of anonymous inner classes varies from one compiler implementation to another. Sun uses numbers to name anonymous classes. For instance, if a top-level class were named Car and one of its methods contained an anonymous inner class, that class might be named Car$1.class by the compiler.

As far as the Java interpreter is concerned, there is no such thing as an inner class: all classes are normal top-level classes. The types of inner classes will be discussed shortly, but briefly they include the member classes, static and non-static, i.e. 'nested inner' classes and just plain 'inner' classes. To make this 'magic' work the compiler performs various 'tricks' that can be observed by running the javap disassembler on the generated class files. Once disassembled, it can be seen that the compiler actually inserts hidden fields, methods, and constructor arguments into the classes it generates. This information will be used to help explain the behavior of each type of inner class. It should also be known that a file can contain only one public top-level class; any attempt to create a file with more than one public class will make the compiler complain. A file can contain multiple non-public classes, though.


Four Types of Inner Classes

When referring to inner classes we will make reference to the containing class, usually called a/the top-level class. A top-level class is a class that is defined as a member of a package. In general, inner classes make it easier for a developer to connect objects together, since classes can be defined closer to the objects that use them. Inner classes are defined within the same Java file as their containing class: they are 'contained' inside a 'container' class, or 'top-level' class.

There are three broad kinds of inner classes, but if we count the static variety separately there are four. The first type we will discuss is the nested inner class, or static member class (so called because it is a static member of the top-level class). We will then go on to discuss plain inner classes, or member classes, which are instance members of the top-level class that contains them. We will then discuss local inner classes, which are defined inside a block of code, most typically a method. Finally, we will briefly discuss the anonymous inner class.


Nested Inner Classes (static member classes)
Nested inner classes are classes within classes. A nested inner class behaves like any static member of a class: you can access it without instantiating the enclosing class, and it can access the enclosing class's static methods and variables, even private ones. These inner classes are always defined with the keyword 'static'. An access modifier (public, protected, private) can be specified, but by default a nested class takes package access.

This raises an interesting question: if marking a member static means there is only one instance of it no matter how many instances of the outer class are created, how could a static inner class know which instance variables of its non-static outer class to access? The answer is obvious: it could not know, which is why a static inner class cannot access instance variables of its enclosing class.

Sun considers nested classes to be top-level classes. Since a static member class is compiled into an ordinary top-level class, there is no direct way for it to access the private members of its container. As mentioned above in the history section, the compiler generates extra code. This makes sense: if a static member class uses a private member of its containing class (or vice versa), the compiler automatically generates non-private access methods and converts the expressions that access the private members into expressions that access these specially generated methods. The generated methods are given package access, which is sufficient, as the member class and its containing class are guaranteed to be in the same package.

Compiled nested inner classes are named 'OuterClassName$InnerClassName.class', where OuterClassName is the containing class and InnerClassName is the name of the static nested inner class defined inside it. The $ in this name is inserted automatically by the compiler. So if you create a class OuterClass.java and within it a class called InnerClass, the compiler generates two class files. This is why inner classes can be run on any JVM (as mentioned above). Since, by definition, a nested inner class is static, it can be accessed from anywhere in your code as OuterClass.InnerClass. The same applies to writing import statements.
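A minimal sketch of a static nested class; the class names are illustrative:

public class OuterClass {

    private static String label = "outer static state";

    // Compiled to OuterClass$InnerClass.class, a separate top-level class file.
    public static class InnerClass {
        public String describe() {
            // A static nested class may read private static members of its container;
            // the compiler generates package-access accessor methods behind the scenes.
            return "I can see: " + label;
        }
    }
}

// Usage: no OuterClass instance is required.
// OuterClass.InnerClass nested = new OuterClass.InnerClass();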


Inner Classes (member classes)
These 'member classes' are implemented much like static member classes: they are compiled into separate top-level class files, and the compiler performs various code 'tweaks' to make it all work. Inner classes differ from nested classes in that they carry no static modifier. Therefore, every instance of an inner class requires an instance of the containing class, and the containing class can have multiple instances of the contained/inner class. The compiler enforces this association by defining a synthetic field named this$0 in each member class. This field holds a reference to the enclosing instance, and every (non-static) inner class constructor is given an extra parameter that initializes it.
A non-static inner class can be declared public, protected, or private, or given the default package visibility. But, as mentioned previously, member classes are compiled to top-level class files, and top-level classes can only have public or package access, which means member classes can only have public or package visibility as far as the Java interpreter is concerned. So if a member class is declared protected it is actually treated as a public class, and if it is declared private it actually has package visibility. According to one anonymous source, "Although the interpreter cannot enforce these access control modifiers, the modifiers are noted in the class file. This allows any conforming Java compiler to enforce the access modifiers and prevent the member classes from being accessed in unintended ways." Just a thought, but can you deduce how this relates to security?
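A minimal sketch of a non-static member class; the names are illustrative:

public class Outer {

    private int counter = 0;

    // A plain member class: every instance is tied to an enclosing Outer instance
    // through the compiler-generated this$0 field.
    public class Inner {
        public int bump() {
            return ++counter;  // touches instance state of the enclosing object
        }
    }
}

// Usage: an Inner can only be created relative to an Outer instance.
// Outer outer = new Outer();
// Outer.Inner inner = outer.new Inner();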


Local Inner Classes
These inner classes are defined inside a block of code, such as a method, and are not available outside the block in which they are defined, even elsewhere in the enclosing class. However, like member classes, a local class is able to refer to fields and methods of its containing class, for exactly the same reason and by the same mechanism available to a member class: it is passed a hidden reference to the containing instance in its constructor and saves that reference in a private field added by the compiler. The private fields and methods of the containing class are likewise available to local inner classes because the compiler inserts any required accessor methods.

What separates a local class from a member class is the ability to refer to local variables in the scope that defines it. There is, however, one restriction: local classes can only reference local variables and parameters that are declared final. According to the Java spec, this is because the compiler automatically gives the local class a private instance field to hold a copy of each local variable the class uses, and, as previously mentioned, the compiler adds hidden parameters to each local class constructor to initialize these compiler-created fields. A local class therefore isn't actually accessing the local variables, merely private copies of them. To make this work there must be a guarantee that the values won't change, and that guarantee is the keyword final. Almost magic!
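A minimal sketch of a local class capturing a final local variable; the names are illustrative:

public class Greeter {

    public Runnable makeGreeting(final String name) {  // final, so the class may capture it
        // A local class, visible only inside this method.
        class GreetingTask implements Runnable {
            public void run() {
                // 'name' is really a private copy held in a compiler-generated field.
                System.out.println("Hello, " + name);
            }
        }
        return new GreetingTask();
    }
}

// Usage:
// new Greeter().makeGreeting("world").run();  // prints "Hello, world"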


Anonymous Inner Classes
An anonymous inner class does not have a name. They are often used in GUI development to implement listeners. An anonymous class must either extend a class or implement an interface, but unlike a member class it can't do both. An access modifier is never specified and cannot be specified, and since the class doesn't have a name it can't declare a constructor.
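A minimal sketch of an anonymous class; the compiler will emit it as something like EnclosingClass$1.class:

Runnable task = new Runnable() {  // implements the interface; no named class, no constructor
    public void run() {
        System.out.println("running anonymously");
    }
};
task.run();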

By now the reader should have a pretty good understanding of inner classes: how they are created by the compiler, how they are treated by the interpreter, and how and why their access behavior exists. Next we will touch on some basic advantages of using inner classes.


Advantages of Using Inner Classes

There are quite a few advantages to using inner classes. These include, but are not limited to, blocks of code, integration of objects within objects, and a more object-oriented structure.

Blocks of Code
Nested inner classes give developers the ability to implement blocks of code that realize special functions inside a given object. Using nested inner classes, a developer can integrate blocks of code inside an object to perform special tasks, much like function pointers in C.

Integration of Objects in Objects
By giving developers the ability to integrate blocks of code inside other objects via an inner class, a developer gains the ability to integrate objects within objects, which can lead to a more object-oriented design. A developer can create objects that are specific to the enclosing class, which makes the code easier to read and to maintain.

More Object Oriented Structure
By allowing developers to define inner objects, top-level classes can effectively be defined inside another object. Multiple class files do not need to be written for every distinct object in the program. Objects can be integrated into one another, which makes the code easier to understand, and it makes sense, from an object-oriented perspective, to keep a class that is only used by a certain class inside that class.


Inner Class Uses and Strategies

There are various uses and many strategies that take advantage of the powerful nature of inner class design. We will present two of these approaches. The first is the implementation of 'callbacks'. Java doesn't use callbacks quite the way C does, but we can certainly imitate them, as is clearly seen in the AWT framework. The other approach is a design strategy used among various persistence frameworks.


Callbacks
A common technique in the C language is to pass a pointer to a function in the argument list of another function. The receiving function can invoke the passed-in function using the pointer. This approach is referred to as a callback. Callbacks are very useful in situations where you want to invoke different functions without needing to know which particular function is being invoked. For example, a plotting function could receive a pointer to a function that takes a value on the horizontal axis as an argument and returns a value for the vertical axis; the plotting function could then plot any such function passed to it.

Inner classes are very useful when you want to implement callbacks. Java, however, does not provide pointers (direct memory addresses) to methods. Instead, interfaces are used for callbacks: a method declares an interface reference in its argument list and then invokes a method on that interface. A callback method is a method intended to be passed as a parameter; the callee can then call the passed method asynchronously and with whatever parameters it deems appropriate. One of the basic techniques used in Java to fake a callback is to use inner classes to define an anonymous callback class, instantiate an anonymous callback delegate object, and pass it as a parameter, all in one expression. This allows a developer to have a separate block of code handle the actionPerformed method of the ActionListener interface for every graphical component. This is one of the many features found in other languages that the Java language has been able to imitate.
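A minimal sketch of the anonymous-callback idiom with Swing; the button and the printed message are illustrative:

import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import javax.swing.JButton;

public class CallbackDemo {
    public static void main(String[] args) {
        JButton button = new JButton("Click me");
        // Define, instantiate, and pass the callback delegate in one expression.
        button.addActionListener(new ActionListener() {
            public void actionPerformed(ActionEvent e) {
                System.out.println("button clicked");
            }
        });
    }
}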


Inner Class Strategy: Data Access Objects (DAOs)
Let's say a developer wants to isolate data access and manipulation in a separate layer while also keeping member variables private, thus retaining full encapsulation and object integrity. How do we handle this?
A Data Access Object (DAO) is found when dealing with object-relational persistence. The DAO is the primary object of the 'Data Access Object' pattern, one of the core J2EE patterns. The Data Access Object abstracts the underlying data access implementation from the BusinessObject to enable transparent access to the data source. There is often a need for the DAO implementation to modify data on the domain object (business object) being persisted. For instance, when a new object is inserted into a relational database table, it can sometimes be auto-assigned a surrogate primary key as opposed to a natural primary key. In a properly designed entity class, the field representing the primary key should not be publicly modifiable, as that would compromise the integrity of the object's state.

A solution is to create the DAO as an abstract static inner class of the domain object it persists. This works because inner classes have access to the private members of the outer class, so the DAO implementation has access to select private members while the actual persistence behavior is still isolated in its own class. For instance, say the Customer class has three private members: name, customerNumber (a natural primary key) and customerId (an auto-generated surrogate primary key). We want to make sure the customerId cannot be changed outside the Customer class (the business object). To satisfy this requirement we create the CustomerDAO as an abstract static inner class of the Customer class. The customerId is then exposed to the public only as a readable property, retaining encapsulation and guaranteeing no outside change, while the DAO implementation is still allowed to modify the attribute.
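A minimal sketch of the strategy; the fields and method names are illustrative:

public class Customer {

    private String name;
    private String customerNumber;  // natural key
    private Long customerId;        // surrogate key, assigned by the database

    // Readable to the public, but there is no public setter.
    public Long getCustomerId() {
        return customerId;
    }

    // The DAO lives inside the entity it persists, so its implementations
    // may touch the private surrogate key that outside code cannot.
    public abstract static class CustomerDAO {
        public abstract void insert(Customer c);

        // Only DAO subclasses get to assign the generated key.
        protected void assignGeneratedId(Customer c, Long id) {
            c.customerId = id;
        }
    }
}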

In this paper we have discussed inner classes as they are applied in Java. We have discussed the history and basic ideas behind inner classes. We have also discussed the various types of inner classes: the nested (static) inner class, the non-static inner class, the local inner class, and the anonymous inner class. We discussed how inner classes are invoked, their scope, and how they relate to the Java compiler and interpreter. We also discussed some concepts and design practices that use the inner class design, such as the implementation of 'callbacks' and its use in persistence frameworks.

References

http://mindprod.com/jgloss/callback.html
http://www.javaworld.com/javaworld/javatips/jw-javatip10.html
http://java.about.com/cs/javadocumentation/g/callback.html
http://www.particle.kth.se/~lindsey/JavaCourse/Book/Part1/Java/Chapter04/interface.html
http://nils.kilden-pedersen.net/DAO_pattern/InnerClassStrategy.html
http://www.unix.org.ua/orelly/java-ent/jnuy/ch03_13.htm

Dynamically Calculating Primary Keys

Oracle's Application Development Framework Business Components (ADF BC) allows for the implementation of business rules for adding default values to entity attributes, but what kind of rule can we implement for the case of a sequence-generated primary key?
It is standard practice within current industry to define a primary key column on all relational database tables, which in most cases is a unique number (as opposed to a unique string of characters). This unique number can be generated by a database sequence generator or by calculating the next available value based on what is currently in the column. The latter presents obvious transactional problems, whereas the former is the standard way of handling unique number-based primary keys. This presents an interesting challenge when implementing business logic that inserts new records into a database: if we create a new Row, populate the various attributes (except the primary key) and attempt to save the transaction, we will most assuredly receive a database error, as a primary key is always unique and non-null. To prevent this type of error we can implement the logic in one of two ways:
• Java code (dynamically calculated default values), or
• the ADF BC provided class, oracle.jbo.domain.DBSequence, used in tandem with a database trigger.
There are many approaches to dynamically calculating the sequence value with Java code. I will describe two: the first is how I implemented the solution on one project, and the second is the solution recommended by Oracle, which is what I now practice.
The Java-based solution I implemented involved creating various private methods on the ApplicationModuleImpl: getNextSequence(), which called a PL/SQL stored procedure to return the next available sequence value, and createRecord(), which creates a new row, populates the various attributes (including a call to getNextSequence() for the primary key) and inserts the row into the ViewObject, which in turn inserts it into the current transaction upon postChanges() and into the database once commit is called.
The Oracle solution is far more elegant and utilizes the existing ADF BC library.
The EntityImpl class contains a protected method, create(), which can be overridden to set attribute defaults or, in this case, dynamically assign a database sequence number to the primary key. create() is called whenever the entity object instance is created, i.e. whenever a new record is created from the ViewObject. In the create() method, just after the call to super.create(), we use the oracle.jbo.server.SequenceImpl class, which wraps Oracle database sequences. To instantiate a SequenceImpl we need the current DBTransaction, obtainable through a call to the getDBTransaction() method (conveniently found in the EntityImpl class), and the database sequence name. Once we have a SequenceImpl instance we can get the next sequence value and 'automatically' set the primary key attribute immediately upon Row creation. Two observations and potential issues apply to both 'Java code' solutions: we had to add extra Java code, and sequence numbers can be used up when the current transaction is rolled back. I'm not saying that adding Java code is necessarily bad, as certain requirements warrant this type of solution, but we certainly don't want to waste sequence numbers if we want our sequence to stay close to consecutive.
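A minimal sketch of the override; the entity, attribute, and sequence names (MeetingImpl, MeetingId, MEETING_SEQ) are illustrative:

import oracle.jbo.AttributeList;
import oracle.jbo.server.EntityImpl;
import oracle.jbo.server.SequenceImpl;

public class MeetingImpl extends EntityImpl {

    @Override
    protected void create(AttributeList attributeList) {
        super.create(attributeList);
        // Wrap the database sequence and assign its next value to the primary key.
        SequenceImpl seq = new SequenceImpl("MEETING_SEQ", getDBTransaction());
        setAttribute("MeetingId", seq.getSequenceNumber());
    }
}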

The second way (and certainly not the last) to generate a primary key using database sequences is to create a trigger in the database that populates the value of the column on which the attribute is based. If a trigger is created to handle the creation of the primary key, then the data type of the entity attribute representing the primary key should be set to DBSequence. DBSequence maintains a temporary unique value in the entity cache until the data is posted, at which time the real sequence value is generated. This way, whenever a new record is created we do not have to populate the primary key attribute via Java code inside the service methods, nor do we use up our sequences. Of note: any field that uses DBSequence should not be made visible to the user until the row has been posted, as the temporary value in the cache has no relationship to the value the database column will ultimately contain.
So there you have it: two options for an ADF BC business rule that supplies a sequence-generated primary key value for newly created rows, either dynamically calculating the value in Java code using the ADF BC libraries, or using a database trigger together with DBSequence as the attribute type.
Next month, we’ll talk about adding default values to entity attributes other than primary keys.

Case Study: Oracle's JDeveloper IDE and its Use for Data Migration

This case study attempts to acquaint readers with JDeveloper's powerful BC4J technology and how it was used to create a Java application to perform data migration. This is by no means exhaustive, nor does it imply that the implementation described herein is the only way to implement such an application. We write this case study merely as an example describing common problems, solutions, and technology explanations to aid readers with their own data-migration ventures.
The following topics will be discussed as they all relate to data migration:
The design pattern implemented in the application for code maintainability,
Where, when and why to write custom code in the ADF-generated Java classes,
Solutions to some common problems encountered before and during implementation of the application,
Unit testing and a simple design pattern that, as will be seen, will make testing relatively easy and quite maintainable,
Exception handling.

On my most recent project, I was tasked to perform a data migration between one database (specify) and another (specify). Contract requirements, by which my team and I were bound, required us to use the JDeveloper IDE (although Eclipse was preferred, and we will describe ways in which we used Eclipse). This proved to be a major advantage over other open-source technologies (specify), mainly due to ADF's Business Components for Java (BC4J) technology as available from within JDeveloper (specify other possible ways to use BC4J without JDeveloper).

With BC4J we can create accurate object-relational mappings of database tables as entity objects (what is an 'entity' object). These entity objects can be used to create one or more view objects based on one or more entity objects (or even no entity object, in the case of a transient view object, to be discussed in a later article). To access the data model, the view objects must be grouped into what is called an Application Module (an application requiring database connectivity, as developed in JDeveloper, must have one or more Application Modules). One design pattern is to create nested application modules, where one Application Module is chosen as the root module and other application modules are nested within it. Another design pattern, as used in this case study, is to have separate, non-nested application modules where transactional coordination between them is not needed.

The required task was to migrate data from one application's data model to another application's slightly similar data model. Both applications served the same purpose, which was tracking PDUFA-related meetings. They were similar in that they both tracked meetings: the source application tracked the history of the meeting and related drug applications, and the destination application tracked the current status of the meeting as well as the associated drug application sponsors.

As with any project, requirements gathering comes first in the project's life cycle. The requirements are certainly needed, yet they are not always well elaborated; sometimes a single statement makes up the whole requirement, as was the case for this data migration. The stated requirement for this task was almost literally "... migrate database A to database B ...". As can be seen, this picture didn't paint a thousand words, so the requirements had to be gathered and defined. The process was difficult at first. The schemas made available to me were not accurate; for instance, tables were missing from the source database, specifically the main table. There were tables in both schemas that were not used in production at all, which required determining which tables to use.

Several components are required for a successful migration. You'll need to understand the purpose of the source application as well as the purpose of the destination application (in this case study they are the same). This knowledge can be obtained from each application's user guide, instruction manual, and related media (hopefully documentation was generated). You'll need points of contact (POCs): POCs for both the source and destination applications, who can direct you to documentation; POCs representing the users of the applications; and POCs for the databases themselves. You'll need read/write access to both schemas. You'll need data dictionaries for both schemas; if these do not exist or cannot be found readily, the data dictionary must be created through research, as was the case here, and that can take significant time. The data dictionary is important because it gives you the data types, sizes, not-null constraints, and the meaning and/or possible values of each column. This information is required to create correct mappings between the data sources.

When building the mapping definitions, keep in mind which source columns are required and which are not (can be null). Some columns will not be mapped and should be noted. Some source columns will be dropped. Some columns will have a direct mapping, whereas for other columns the relationship must be derived; the derivation could be based on one or more columns and/or separate schemas. A derived mapping requirement should consist of an algorithm detailing how the column is to be populated from the source data, usually worked out by talking to the POCs. Again, keep in mind that a source column cannot be larger than the destination column, and you must compensate for data type differences: a source value larger than the destination column, or data of the wrong type, will cause an exception to be thrown during the migration.

A separate schema can be created for translation purposes. In other words, let's say a specific source column maps directly to a destination column, and the only difference is that the possible values in the two columns have the same meaning but are not exactly the same (different words, same meaning). A 'middle' or 'look-up' table can be used in this case, where one column contains the possible source values and the other column holds the direct mapping to the corresponding destination values. Once the mapping is complete and the client agrees with what has been defined, the requirements gathering process is complete and coding can begin.
You will need to create application modules for each data source (connection). This is the beauty of using Oracle's ADF BC4J technology: persistence is easy. Remember that the connection is defined at the project level, so if, when creating your view objects, you cannot see the schema during the create wizard, you must change the default database at the project level. My application made use of four application modules: source, destination, middle (translation and exceptions tables), and an outside schema used for translation purposes. Each of these contains the view objects pertaining to its respective tables.
The source application module contained view objects for about 7 tables. The destination application module contained view objects for 2 tables: 1 parent table and 1 related child table. The middle application module contained view objects for 2 translation tables, copies of the destination tables, and a migration-linking table.
The two translation tables were used as a mapping between source and destination columns that had the same meaning but different representations of the data.
The tables that are copies of the source tables are used to store the records being migrated, along with any status associated with them; all constraints are stripped and no parent/child relationship exists. These tables are important to the migration because they give an accurate account of each migrated record: whether or not it was successfully migrated. They can also aid in the manual clean-up work that inevitably accompanies migrations.
The linking table has two main purposes. First, both the source and destination schemas contain a primary key labeled meeting id, each generated by its own sequence number generator; in order to maintain a link between the two schemas, we store the mapping of the migrated records, which consists of the meeting id from the source table and the newly generated meeting id for the destination table. The other purpose of this table is to support the initialization methods found in the main class, which are used to clear out newly migrated data when performing back-to-back migrations for testing purposes.
Two distinct design patterns appear in this effort: the first at the application level and the second at the unit-testing level.
At the application level, the design I used consisted of a single translator class, a utility class, and two bean classes. View objects bound to the source, destination, and middle tables were contained in their respective application modules. And of course we had a main class for executing the application.
The translator class is used as a medium for translating source values into destination values, where each method implements a mapping algorithm based on the migration requirements. The translator class includes, as members, the application modules needed to perform the translations. The utility class contains methods for performing various repeated tasks; I also pulled private methods out of the translator class and into the utility class, which allows those methods to be unit tested. The bean classes implemented temporary storage for the translated source data and contained getters and setters for each column value of the destination tables. Validation was also implemented, testing for values that must be not-null based on the column constraints specified during requirements gathering. Also included in the bean classes was a migration status variable, whose purpose is to record the status, and any message, associated with the migration of the record, ranging from a successful migration to some caught exception. We could also have used transient view objects for temporary storage, and that is the recommended way, but I did not implement them here.
The design for the unit testing consisted of connect fixtures for each application module, unit tests for the translator class, unit tests for the utility class, and unit tests for the instance methods in the main class.
The connect fixtures initialize each application module as it is used in the unit tests and are fairly easy to create (see the sketch below). I created a unit test class for each translator method. I could have created one class with a bunch of test methods, but it would have become fairly unwieldy; the single class/method pattern makes it easier to test and debug one method at a time. I created a test suite to run all the tests at once, and one unit test class for the utility class's methods.

The unit tests for the instance methods in the main class require further discussion. I created create, update and delete (CRUD) methods for each destination table; these methods were used in the initialization test methods. The initialization methods, which let us run multiple migrations without creating duplicates, delete all records in the destination tables based on the records found in the middle table that links the source records to the newly created destination records. The initializations are performed in the following order: the child tables are cleared, followed by the parent tables, and if no exception occurs we commit the transaction; then we delete all records in the exceptions tables and in the middle linking table. To test the initialization method we create records in each source table and run it, covering both the case where the destination tables contain no records and the case where they do. The migration shouldn't continue if initialization fails.

Once initialization is successful we can continue with the migration. The selection query for the source data is based upon the requirements. We use a view object and iterate through each record, using the meeting id to create the beans for the child records and associated parent records from each record factory (the record factories contain the translator classes); then we take each bean and insert it into the destination view objects. We call validate on each record and call commit. If the transaction is successful we insert a new linking record, and we record each record's outcome in the exceptions tables. So we have a migrate method, insertIMTSRecord, insertIMTSSponsorRecord, insertLinkedRecord, insertSponsorExceptions, insertImtsExceptions and, as mentioned previously, the initialization methods, which reference delete methods for each table. The test classes then contain test methods for the insert and delete methods. The migrate method is the only method that should have public access. So how do we test the other private methods?
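A minimal sketch of a connect fixture; the module and configuration names are illustrative:

import oracle.jbo.ApplicationModule;
import oracle.jbo.client.Configuration;

public class SourceModuleFixture {

    private ApplicationModule am;

    // Check out the application module once per fixture.
    public ApplicationModule setUp() {
        am = Configuration.createRootApplicationModule(
                "datamigration.model.SourceModule",   // fully qualified AM definition
                "SourceModuleLocal");                 // configuration name
        return am;
    }

    // Release the module when the tests are done.
    public void tearDown() {
        if (am != null) {
            Configuration.releaseRootApplicationModule(am, true);
            am = null;
        }
    }
}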

A few loose notes from this effort:
• Validation for column values that cannot be null: catch the exception (by validating the row) or check in the temporary bean.
• For each migration candidate, insert a record into the middle table, with no constraints, including a status column for the migrated record.
• Initialization methods clear out the destination tables using the source-key-to-destination-key mapping table.
• The middle table contains the source key and its mapped destination key.
• Use temporary sequence numbers instead of using up real sequences.
• Inserting parent/child records: what may cause errors and how to avoid them.

Thursday, July 23, 2009

Java Time and UTC

Java does have a date type that is truly time-zone ignorant: long. It is the time in milliseconds since January 1, 1970 UTC, and it is the value you get when you call System.currentTimeMillis(). You can convert it for display (to local time, server time, UTC, or any other time zone you want) when you want to display it. java.util.Date, java.sql.Date and java.sql.Time all wrap a long, and they contain no time zone information at all. If you call toString(), it will format the time into a string using the local time zone, but toString() is intended primarily as a diagnostic tool. Most of the other methods on those classes are deprecated because date/time localization is not a trivial issue: they don't really work. java.util.Calendar is currently the preferred way to deal with date/time localization in Java; use java.text.DateFormat for more flexible formatting (it uses java.util.Calendar internally to do its work).
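A short sketch of formatting the same long instant in two different time zones (the zone IDs are just examples):

import java.text.DateFormat;
import java.util.Date;
import java.util.TimeZone;

public class UtcDemo {
    public static void main(String[] args) {
        long now = System.currentTimeMillis();  // zone-ignorant instant

        DateFormat df = DateFormat.getDateTimeInstance(DateFormat.MEDIUM, DateFormat.LONG);

        // Same millisecond value, rendered for two different zones.
        df.setTimeZone(TimeZone.getTimeZone("UTC"));
        System.out.println("UTC:      " + df.format(new Date(now)));

        df.setTimeZone(TimeZone.getTimeZone("America/New_York"));
        System.out.println("New York: " + df.format(new Date(now)));
    }
}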

Friday, July 10, 2009

When Windows Explorer Acts Buggy

Recently, at work, my Windows XP machine started acting very strange: Windows Explorer took forever to list directories and appeared to freeze up on occasion. When, after several reboots, the problem persisted, I consulted my friend Google.

The solution was relatively simple.

delete C:\Documents and Settings\\Local Settings\Application Data\Microsoft\Windows

(or you could simply rename it if you don't like the word delete)

However, there is one caveat: you need to sign in as a different user with local admin rights in order to delete this folder, because you can't delete it as the currently logged-in user.

Hope this helps someone.