
Saturday, February 27, 2010

Variables and Identifiers

Variables and Identifiers: what’s the diff?

Some ridiculously wise guy once said, “A house is only as strong as the foundation on which it is built”. That same principle can be applied to a Java developer: a Java developer is only as good as his knowledge of Java fundamentals. So if you’re seeking some fundamental Java truth, read on, because this month’s topic builds on those foundations: we will attempt to clarify the difference between a variable and an identifier.
I was often confused by this ‘concept’ (actually, I didn’t know the difference) and treated the two as one and the same, because general discussions of Java make no distinction between variables and identifiers. In some cases, as will be discussed shortly, that is harmless: with primitives the distinction can safely be ignored. In my humble opinion, the more basic Java topics skip the distinction because it doesn’t take an understanding of byte-code to implement ‘HelloWorld’. But as we approach more advanced topics in Java, the distinction between a variable and an identifier becomes quite important.
The difference between languages that keep data types for runtime objects (such as Java) and those that use data types only at compile time, to generate the executable code that manipulates those objects (such as C/C++), rests largely on the distinction between identifiers and variables, however similar the syntax of these languages may be. One main consequence is that objects are handled quite differently at runtime, which affects how objects are manipulated in the language in general. For a Java developer, understanding this difference, and specifically how Java treats objects, helps with various concepts throughout Java, such as passing objects between other objects or around a network using tools such as RMI, or the concept of an interface. These concepts and more become quite clear to the programmer once the nature of objects at compile time and at runtime is understood.
An identifier is the name and associated data-type that identifies a variable in a Java source file or, more simply, the name of the variable as it appears in the program’s source code.
For instance:
int aPrimitiveInt;
where aPrimitiveInt is the name and int is the data type.
A variable, on the other hand, is the actual instantiation of that data-type at runtime (the memory storage for the data values being manipulated in the program) or, more simply, the memory that is allocated at runtime.
For instance:
int aPrimitiveInt = 24;
where the variable is the memory allocated at runtime, which in this case holds the value 24.
For clarity’s sake, a data-type is a set of data values together with the operations on those values.
As a ‘newbie’ to Java I held a rather fuzzy understanding of the difference between a variable and an identifier, and hence confused the definitions and concepts and referred to both as a variable. As it stands, I was half right: in the case of primitives (int, float, double, etc.) the difference between the identifier and the variable is irrelevant.
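One way to see the difference in practice: a single identifier in the source can correspond to many variables at runtime. The sketch below is only an illustration under my own assumptions (the class IdentifierVsVariable and the method sumDown are made-up names, not from any library). The identifier local appears once in the source, yet every call to sumDown allocates its own storage for it.

```java
class IdentifierVsVariable {
    // 'local' is ONE identifier in the source file, but each invocation
    // of sumDown allocates a separate variable (storage) for it.
    static int sumDown(int depth) {
        int local = depth; // a fresh variable on every call
        if (depth == 0) {
            return local;
        }
        // The recursive call has its own 'local'; ours is untouched.
        return local + sumDown(depth - 1);
    }

    public static void main(String[] args) {
        // Four calls are active at the deepest point, so four separate
        // variables exist for the single identifier 'local'.
        System.out.println(sumDown(3)); // prints 6 (3 + 2 + 1 + 0)
    }
}
```

If the identifier and the variable were literally the same thing, the recursive calls would overwrite each other's value and the sum could not come out right.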
…well, sort of.
Java implements two mechanisms to ensure type matching between compile time identifiers and runtime variables: one for objects, which is by ‘design’, and one for primitives, which is less a designed mechanism than a consequence of the design. The specifics that give rise to the mechanism for primitives are beyond the scope of this article and will not be discussed. The mechanism itself, however, will be.
The mechanism is this: the data type of the variable referenced by a primitive identifier cannot be changed in Java. In other words, if a value of a different data type is used in an assignment to a primitive, the assignment always copies that value and converts the copy to the declared type. Which conversion applies is fixed at compile time, and a narrowing conversion (such as float to int) must be requested with an explicit cast.
Primitive data types are built directly into the Java virtual machine, which therefore shares the same representations and operations for primitive data types as the Java compiler. This means that the compiler can generate the code that manipulates these variables and, once compilation is complete, their data types can be safely forgotten. The programmer therefore knows that the data type of the identifier is the same as the data type of the variable it references.
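The copy-and-convert behaviour is easiest to see with a widening assignment, which Java performs implicitly. This is a minimal sketch, not anything official; PrimitiveCopyDemo and halve are names I made up for the illustration.

```java
class PrimitiveCopyDemo {
    // Assigning an int to a double copies the value and converts the COPY.
    // The int variable that supplied the value keeps its type and contents.
    static double halve(int count) {
        double asDouble = count; // implicit widening conversion, int to double
        return asDouble / 2;     // floating-point division on the copy
    }

    public static void main(String[] args) {
        int count = 7;
        double half = halve(count);
        System.out.println(count); // prints 7, still an int
        System.out.println(half);  // prints 3.5
    }
}
```

At no point does count become a double; only the copy stored under the identifier asDouble has the new type.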
Below is a code fragment to help visualize this concept:
int i = (int) 1.234f;
Here, we have an int and we want to set it from a float value. Since the data type of the variable referenced by the identifier cannot be changed (the int cannot be changed to a float, because Java says so!), the float value is copied and the copy is converted to an int. That is what the (int) cast requests; without the cast, the compiler rejects the assignment. Hence the value being assigned (1.234f) cannot change the data type of the variable that i refers to. So, while it is wrong to say that the identifier and the variable are the same, the distinction can be ignored for primitive data types.
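Note that javac only accepts a float-to-int assignment when an explicit cast is present, and that the cast discards the fractional part. The sketch below shows this under my own naming assumptions (NarrowingCastDemo and truncate are hypothetical names).

```java
class NarrowingCastDemo {
    // A float-to-int assignment needs an explicit cast. The cast converts
    // a copy of the value; the float variable itself is left alone.
    static int truncate(float value) {
        return (int) value; // narrowing conversion, rounds toward zero
    }

    public static void main(String[] args) {
        float f = 1.234f;
        int i = truncate(f);
        System.out.println(i);               // prints 1
        System.out.println(truncate(-1.9f)); // prints -1
        // f is unchanged and still a float; only the copy was converted.
        System.out.println(f == 1.234f);     // prints true
    }
}
```

The variable behind i is an int before and after the assignment, which is exactly why the identifier's declared type can be trusted for primitives.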
Well, I hope you all now have more of a foundational understanding of the Java language, and specifically of the mechanism inherent to Java for type matching between compile time identifiers and runtime variables for primitives. Remember, the distinction between an identifier (the name and the data type it represents) and its associated variable (the actual instantiation of the data at runtime) can be ignored for primitives, as Java does not allow the data type of the variable referenced by a primitive identifier to be changed. For those who did not know this distinction and thought the two were one and the same: you now know different, and can carry on ‘not knowing’, since in a way, by ‘now knowing’, you are in a sense a wise guy.
Next time, we’ll talk about the other mechanism (which is a ‘design’ in Java) to ensure that objects are type safe in Java: the runtime data type tag.
