In computer programming, a variable or scalar is a storage location (identified by a memory address) paired with an associated symbolic name (an identifier), which contains some known or unknown quantity of information referred to as a value. The variable name is the usual way to reference the stored value, in addition to referring to the variable itself, depending on the context. This separation of name and content allows the name to be used independently of the exact information it represents. The identifier in computer source code can be bound to a value during run time, and the value of the variable may thus change during the course of program execution.
Variables in programming may not directly correspond to the concept of variables in mathematics. The latter is abstract, having no reference to a physical object such as a storage location. The value of a computing variable is not necessarily part of an equation or formula as in mathematics. Variables in computer programming are frequently given long names to make them relatively descriptive of their use, whereas variables in mathematics often have terse, one- or two-character names for brevity in transcription and manipulation.
A variable's storage location may be referred to by several different identifiers, a situation known as aliasing. Assigning a value to the variable using one of the identifiers will change the value that can be accessed through the other identifiers.
Compilers have to replace variables' symbolic names with the actual locations of the data. While a variable's name, type, and location often remain fixed, the data stored in the location may be changed during program execution.
Actions on a variable
In imperative programming languages, values can generally be accessed or changed at any time. In pure functional and logic languages, variables are bound to expressions and keep a single value during their entire lifetime due to the requirements of referential transparency. In imperative languages, the same behavior is exhibited by (named) constants (symbolic constants), which are typically contrasted with (normal) variables.
Depending on the type system of a programming language, variables may only be able to store a specified datatype (e.g. integer or string). Alternatively, a datatype may be associated only with the current value, allowing a single variable to store anything supported by the programming language.
Identifiers referencing a variable
An identifier referencing a variable can be used to access the variable in order to read out the value, or alter the value, or edit other attributes of the variable, such as access permission, locks, semaphores, etc.
For instance, a variable might be referenced by one identifier and contain the number 1956. If the same variable is also referenced by a second identifier, and the value of the variable is altered to 2009 through that second identifier, then reading the value through the first identifier will yield 2009 and not 1956.
If a variable is referenced by only a single identifier, that identifier can simply be called the name of the variable. Otherwise, we can speak of one of the names of the variable. In the previous example, the first identifier is a name of the variable in question, and the second identifier is another name of the same variable.
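JavaScript cannot make two variable names share one storage location directly, but two names that refer to the same object behave analogously. A minimal sketch of that analogue, using the hypothetical names `record` and `alias`:

```javascript
// Two identifiers bound to the same object: a change made through one
// name is visible through the other. (Names are illustrative.)
const record = { year: 1956 };
const alias = record;      // no copy is made; both names refer to one object

alias.year = 2009;         // write through the second name
console.log(record.year);  // read through the first name: 2009
```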
Scope and extent
See also: Free variables and bound variables
The scope of a variable describes where in a program's text the variable may be used, while the extent (or lifetime) describes when in a program's execution a variable has a (meaningful) value. The scope of a variable is actually a property of the name of the variable, and the extent is a property of the variable itself. These should not be confused with context (also called environment), which is a property of the program, and varies by point in the source code or execution. Further, object lifetime may coincide with variable lifetime, but in many cases is not tied to variable lifetime.
A variable name's scope affects its extent.
Scope is an important part of the name resolution of a variable. Most languages define a specific scope for each variable (as well as any other named entity), which may differ within a given program. The scope of a variable is the portion of the program code for which the variable's name has meaning and for which the variable is said to be "visible". Entrance into that scope typically begins a variable's lifetime (as it comes into context) and exit from that scope typically ends its lifetime (as it goes out of context). For instance, a variable with "lexical scope" is meaningful only within a certain function/subroutine, or more finely within a block of expressions/statements (accordingly with function scope or block scope); this is static resolution, performable at parse-time or compile-time. Alternatively, a variable with dynamic scope is resolved at run-time, based on a global binding stack that depends on the specific control flow. Variables only accessible within a certain function are termed "local variables". A "global variable", or one with indefinite scope, may be referred to anywhere in the program.
Extent, on the other hand, is a runtime (dynamic) aspect of a variable. Each binding of a variable to a value can have its own extent at runtime. The extent of the binding is the portion of the program's execution time during which the variable continues to refer to the same value or memory location. A running program may enter and leave a given extent many times, as in the case of a closure.
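The difference between scope and extent can be sketched with a closure. In this illustrative example (the names `makeCounter`, `count` and `next` are assumptions), the name `count` is only visible inside `makeCounter`, yet its binding stays alive for as long as the returned function exists:

```javascript
function makeCounter() {
    let count = 0;            // lexically visible only inside makeCounter
    return () => ++count;     // the closure keeps the binding alive
}

const next = makeCounter();
const first = next();         // 1
const second = next();        // 2: the same binding is re-entered on each call

const other = makeCounter();  // a separate call creates a separate binding
const otherFirst = other();   // 1
```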
Unless the programming language features garbage collection, a variable whose extent permanently outlasts its scope can result in a memory leak, whereby the memory allocated for the variable can never be freed since the variable which would be used to reference it for deallocation purposes is no longer accessible. However, it can be permissible for a variable binding to extend beyond its scope, as occurs in Lisp closures and C static local variables; when execution passes back into the variable's scope, the variable may once again be used. A variable whose scope begins before its extent does is said to be uninitialized and often has an undefined, arbitrary value if accessed (see wild pointer), since it has yet to be explicitly given a particular value. A variable whose extent ends before its scope may become a dangling pointer and deemed uninitialized once more since its value has been destroyed. Variables described by the previous two cases may be said to be out of extent or unbound. In many languages, it is an error to try to use the value of a variable when it is out of extent. In other languages, doing so may yield unpredictable results. Such a variable may, however, be assigned a new value, which gives it a new extent.
For space efficiency, the memory space needed for a variable may be allocated only when the variable is first used and freed when it is no longer needed. A variable is only needed when it is in scope, so beginning each variable's lifetime when it enters scope may give space to unused variables. To avoid wasting such space, compilers often warn programmers if a variable is declared but not used.
It is considered good programming practice to make the scope of variables as narrow as feasible so that different parts of a program do not accidentally interact with each other by modifying each other's variables. Doing so also prevents action at a distance. Common techniques for doing so are to have different sections of a program use different name spaces, or to make individual variables "private" through either dynamic variable scoping or lexical variable scoping.
Many programming languages employ a reserved value (often named null or nil) to indicate an invalid or uninitialized variable.
Main article: Type system
See also: Datatype
In statically typed languages such as Java or ML, a variable also has a type, meaning that only certain kinds of values can be stored in it. For example, a variable of type "integer" is prohibited from storing text values.
In dynamically typed languages such as Python, it is values, not variables, which carry type. In Common Lisp, both situations exist simultaneously: A variable is given a type (if undeclared, it is assumed to be `T`, the universal supertype) which exists at compile time. Values also have types, which can be checked and queried at runtime.
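JavaScript is also dynamically typed, so it can serve as a small illustration of types belonging to values rather than variables. The variable name `item` is hypothetical:

```javascript
// The type belongs to the value, so one variable can hold values of
// different types over time.
let item = 42;
const typeBefore = typeof item;   // "number"

item = "forty-two";               // same variable, new value of another type
const typeAfter = typeof item;    // "string"
```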
Typing of variables also allows polymorphisms to be resolved at compile time. However, this is different from the polymorphism used in object-oriented function calls (referred to as virtual functions in C++) which resolves the call based on the value type as opposed to the supertypes the variable is allowed to have.
Variables often store simple data, like integers and literal strings, but some programming languages allow a variable to store values of other datatypes as well. Such languages may also enable functions to be parametrically polymorphic. These functions operate like variables to represent data of multiple types. For example, a function named `length` may determine the length of a list. Such a function may be parametrically polymorphic by including a type variable in its type signature, since the number of elements in the list is independent of the elements' types.
The formal parameters (or formal arguments) of functions are also referred to as variables. For instance, when a Python function that declares a single parameter is called with the integer 5, the parameter is the variable that is given a value at the call, and 5 is the argument which gives it that value.
In most languages, function parameters have local scope. Such a parameter can only be referred to within its function (though of course other functions can also have variables with the same name).
The specifics of variable allocation and the representation of their values vary widely, both among programming languages and among implementations of a given language. Many language implementations allocate space for local variables, whose extent lasts for a single function call on the call stack, and whose memory is automatically reclaimed when the function returns. More generally, in name binding, the name of a variable is bound to the address of some particular block (contiguous sequence) of bytes in memory, and operations on the variable manipulate that block. Referencing is more common for variables whose values have large or unknown sizes when the code is compiled. Such variables reference the location of the value instead of storing the value itself, which is allocated from a pool of memory called the heap.
Bound variables have values. A value, however, is an abstraction, an idea; in implementation, a value is represented by some data object, which is stored somewhere in computer memory. The program, or the runtime environment, must set aside memory for each data object and, since memory is finite, ensure that this memory is yielded for reuse when the object is no longer needed to represent some variable's value.
Objects allocated from the heap must be reclaimed once they are no longer needed. In a garbage-collected language (such as C#, Java, Python, Go and Lisp), the runtime environment automatically reclaims objects when extant variables can no longer refer to them. In non-garbage-collected languages, such as C, the program (and the programmer) must explicitly allocate memory, and then later free it, to reclaim its memory. Failure to do so leads to memory leaks, in which the heap is depleted as the program runs, risking eventual failure from exhausting available memory.
When a variable refers to a data structure created dynamically, some of its components may be only indirectly accessed through the variable. In such circumstances, garbage collectors (or analogous program features in languages that lack garbage collectors) must deal with a case where only a portion of the memory reachable from the variable needs to be reclaimed.
Main article: Naming conventions (programming)
See also: Identifier and Namespace
Unlike their mathematical counterparts, programming variables and constants commonly take multiple-character names. Single-character names are most commonly used only for auxiliary variables; for instance, `i`, `j`, `k` for array index variables.
Some naming conventions are enforced at the language level as part of the language syntax and involve the format of valid identifiers. In almost all languages, variable names cannot start with a digit (0–9) and cannot contain whitespace characters. Whether, which, and when punctuation marks are permitted in variable names varies from language to language; many languages only permit the underscore ("_") in variable names and forbid all other punctuation. In some programming languages, specific (often punctuation) characters (known as sigils) are prefixed or appended to variable identifiers to indicate the variable's type.
Case-sensitivity of variable names also varies between languages, and some languages require the use of a certain case in naming certain entities.[note 1] Most modern languages are case-sensitive; some older languages are not. Some languages reserve certain forms of variable names for their own internal use; in many languages, names beginning with two underscores ("__") fall under this category.
However, beyond the basic restrictions imposed by a language, the naming of variables is largely a matter of style. At the machine code level, variable names are not used, so the exact names chosen do not matter to the computer. Names thus serve to identify variables; beyond that, they are a tool for programmers to make programs easier to write and understand. Poorly chosen variable names can make code more difficult to review, so clear names are often encouraged.
Programmers often create and adhere to code style guidelines which offer guidance on naming variables or impose a precise naming scheme. Shorter names are faster to type but are less descriptive; longer names often make programs easier to read and the purpose of variables easier to understand. However, extreme verbosity in variable names can also lead to less comprehensible code.
[note 1] For example, Haskell requires that names of types start with a capital letter.
9. Variables and scoping
9.1 Overview
ES6 provides two new ways of declaring variables: `let` and `const`, which mostly replace the ES5 way of declaring variables, `var`.
9.1.1 `let`
`let` works similarly to `var`, but the variable it declares is block-scoped: it only exists within the current block. `var` is function-scoped.
In the following code, you can see that the `let`-declared variable only exists inside the block that starts in line A:
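A minimal sketch of this behavior. The function name `order` and the variable name `tmp` are illustrative assumptions, and the `typeof` check is only there so the result can be inspected without triggering an error:

```javascript
function order(x, y) {
    if (x > y) { // (A)
        let tmp = x;   // tmp exists only inside this block
        x = y;
        y = tmp;
    }
    // typeof is safe on names that are not in scope and yields "undefined"
    const tmpVisible = typeof tmp !== 'undefined';
    return [x, y, tmpVisible];
}

const ordered = order(5, 2);   // [2, 5, false]
```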
9.1.2 `const`
`const` works like `let`, but the variable you declare must be immediately initialized, with a value that can’t be changed afterwards.
Since `for-of` creates one binding (storage space for a variable) per loop iteration, it is OK to `const`-declare the loop variable:
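Both points can be sketched as follows. The names are illustrative, and `new Function` is used only so that the parse-time error for a missing initializer can be caught and inspected at runtime:

```javascript
// A const declaration without an initializer is rejected when it is parsed.
let missingInitializer = null;
try {
    new Function('const foo;');
} catch (e) {
    missingInitializer = e;   // SyntaxError
}

// for-of creates one binding per iteration, so the loop variable may be const.
const seen = [];
for (const x of ['a', 'b']) {
    seen.push(x);
}
```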
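9.1.3 Ways of declaring variables
The following table gives an overview of six ways in which variables can be declared in ES6 (inspired by a table by kangax):
| |Hoisting|Scope|Creates global properties|
|---|---|---|---|
|`var`|Declaration|Function|Yes|
|`let`|Temporal dead zone|Block|No|
|`const`|Temporal dead zone|Block|No|
|`function`|Complete|Block|Yes|
|`class`|No|Block|No|
|`import`|Complete|Module-global|No|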
9.2 Block scoping via `let` and `const`
Both `let` and `const` create variables that are block-scoped – they only exist within the innermost block that surrounds them. The following code demonstrates that the `const`-declared variable only exists inside the block of the `if` statement:
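A small sketch of such a function (the names `funcConst` and `tmp` are assumptions); the error is caught so that it can be inspected:

```javascript
function funcConst() {
    if (true) {
        const tmp = 123;   // exists only inside the if block
    }
    return tmp;            // throws a ReferenceError: tmp is not in scope here
}

let blockError = null;
try {
    funcConst();
} catch (e) {
    blockError = e;
}
```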
In contrast, `var`-declared variables are function-scoped:
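For comparison, a sketch of the same shape with `var` (names again illustrative):

```javascript
function funcVar() {
    if (true) {
        var tmp = 123;   // var ignores the block; it belongs to the function
    }
    return tmp;          // 123
}

const fromVar = funcVar();
```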
Block scoping means that you can shadow variables within a function:
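Shadowing can be sketched like this, with the hypothetical names `shadowDemo` and `foo`:

```javascript
function shadowDemo() {
    const log = [];
    let foo = 5;
    if (true) {
        let foo = 10;    // shadows the outer foo inside this block
        log.push(foo);   // 10
    }
    log.push(foo);       // 5: the outer foo was never touched
    return log;
}

const shadowLog = shadowDemo();   // [10, 5]
```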
9.3 `const` creates immutable variables
Variables created by `let` are mutable:
Constants, variables created by `const`, are immutable – you can’t assign different values to them:
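Both cases side by side, as a minimal sketch with illustrative names:

```javascript
let counter = 'abc';
counter = 'def';         // allowed: let bindings are mutable

const fixed = 'abc';
let assignError = null;
try {
    fixed = 'def';       // not allowed: const bindings are immutable
} catch (e) {
    assignError = e;     // TypeError
}
```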
Spec detail: changing a `const` variable always throws a `TypeError`
Normally, changing an immutable binding only causes an exception in strict mode, as per `SetMutableBinding()`. But `const`-declared variables always produce strict bindings – see `FunctionDeclarationInstantiation(func, argumentsList)`, step 35.b.i.1.
9.3.1 Pitfall: `const` does not make the value immutable
`const` only means that a variable always has the same value, but it does not mean that the value itself is or becomes immutable. For example, a constant can refer to an object, but the value it points to is mutable – we can add a property to it:
We cannot, however, assign a different value to the constant:
If you want the value of the constant to be immutable, you have to take care of it yourself. For example, by freezing it:
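The three steps above can be sketched in one listing. The names `obj` and `frozen` are assumptions, and the write to the frozen object is wrapped in a strict-mode function because in sloppy mode it would fail silently instead of throwing:

```javascript
const obj = {};
obj.prop = 123;              // allowed: the object itself is mutable

let rebindError = null;
try {
    obj = {};                // not allowed: the binding is immutable
} catch (e) {
    rebindError = e;         // TypeError
}

const frozen = Object.freeze({});
let freezeError = null;
try {
    (function () {
        'use strict';
        frozen.prop = 123;   // rejected: the object is frozen
    })();
} catch (e) {
    freezeError = e;         // TypeError
}
```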
9.3.1.1 Pitfall: `Object.freeze()` is shallow
Keep in mind that `Object.freeze()` is shallow: it only freezes the properties of its argument, not the objects stored in its properties. For example, the outer object here is frozen:
But the object stored in its property is not.
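A short sketch of the shallowness, using the hypothetical names `outer`, `foo` and `qux`:

```javascript
const outer = Object.freeze({ foo: {} });

outer.foo.qux = 'abc';   // succeeds: the nested object is not frozen

const outerFrozen = Object.isFrozen(outer);       // true
const innerFrozen = Object.isFrozen(outer.foo);   // false
```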
9.3.2 `const` in loop bodies
Once a `const` variable has been created, it can’t be changed. But that doesn’t mean that you can’t re-enter its scope and start fresh, with a new value. For example, via a loop:
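A sketch of such a loop; the function name `logArgs` and the variable names are assumptions. Lines A and B each contain a `const` declaration:

```javascript
function logArgs(...args) {
    const lines = [];
    for (const [index, elem] of args.entries()) { // (A)
        const message = index + '. ' + elem;      // (B)
        lines.push(message);
    }
    return lines;
}

const argLines = logArgs('Hello', 'everyone');
// ['0. Hello', '1. everyone']
```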
There are two `const` declarations in this code, in line A and in line B. And during each loop iteration, their constants have different values.
9.4 The temporal dead zone
A variable declared by `let` or `const` has a so-called temporal dead zone (TDZ): When entering its scope, it can’t be accessed (got or set) until execution reaches the declaration. Let’s compare the life cycles of `var`-declared variables (which don’t have TDZs) and `let`-declared variables (which have TDZs).
9.4.1 The life cycle of `var`-declared variables
`var` variables don’t have temporal dead zones. Their life cycle comprises the following steps:
- When the scope (its surrounding function) of a `var` variable is entered, storage space (a binding) is created for it. The variable is immediately initialized, by setting it to `undefined`.
- When the execution within the scope reaches the declaration, the variable is set to the value specified by the initializer (an assignment) – if there is one. If there isn’t, the value of the variable remains `undefined`.
9.4.2 The life cycle of `let`-declared variables
Variables declared via `let` have temporal dead zones and their life cycle looks like this:
- When the scope (its surrounding block) of a `let` variable is entered, storage space (a binding) is created for it. The variable remains uninitialized.
- Getting or setting an uninitialized variable causes a `ReferenceError`.
- When the execution within the scope reaches the declaration, the variable is set to the value specified by the initializer (an assignment) – if there is one. If there isn’t, then the value of the variable is set to `undefined`.
`const` variables work similarly to `let` variables, but they must have an initializer (i.e., be set to a value immediately) and can’t be changed.
9.4.3 Examples
Within a TDZ, an exception is thrown if a variable is got or set:
If there is an initializer then the TDZ ends after the initializer was evaluated and the result was assigned to the variable:
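Both situations can be sketched as follows. The names are illustrative, and each forbidden access is wrapped in `try`/`catch` so that the sketch runs to completion:

```javascript
function tdzDemo() {
    const events = [];
    {
        // The TDZ for tmp starts at the top of this block.
        try { tmp = true; } catch (e) { events.push('set: ' + e.name); }
        try { tmp; } catch (e) { events.push('get: ' + e.name); }

        let tmp;                    // TDZ ends; tmp is initialized to undefined
        events.push(String(tmp));   // "undefined"

        tmp = 123;
        events.push(String(tmp));   // "123"
    }
    return events;
}
const tdzEvents = tdzDemo();

// With an initializer, the TDZ only ends after the initializer was evaluated.
let initError = null;
try {
    let foo = String(foo);          // foo on the right is still in its TDZ
} catch (e) {
    initError = e;                  // ReferenceError
}
```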
The following code demonstrates that the dead zone is really temporal (based on time) and not spatial (based on location):
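A sketch of the temporal nature of the dead zone, with the hypothetical names `func` and `myVar`. The function mentions `myVar` at a location before the declaration, but it is only called after the declaration has been executed:

```javascript
let calledResult;
{
    // Enter a new scope; the TDZ for myVar starts.
    const func = function () {
        return myVar;        // fine: nothing is accessed until func is called
    };

    // Still inside the TDZ; calling func() here would throw a ReferenceError.

    let myVar = 3;           // the TDZ ends
    calledResult = func();   // called outside the TDZ
}
```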
9.4.4 `typeof` throws a `ReferenceError` for a variable in the TDZ
If you access a variable in the temporal dead zone via `typeof`, you get an exception:
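A sketch contrasting the two cases; `someUndeclaredName` is assumed not to exist anywhere, and `foo` is an illustrative name:

```javascript
let typeofError = null;
let undeclaredType;
{
    undeclaredType = typeof someUndeclaredName;   // "undefined", no error

    try {
        typeof foo;           // foo is declared below but still uninitialized
    } catch (e) {
        typeofError = e;      // ReferenceError
    }
    let foo;
}
```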
Why? The rationale is as follows: the variable is not undeclared, it is uninitialized. You should be aware of its existence, but aren’t. Therefore, being warned seems desirable.
Furthermore, this kind of check is only useful for conditionally creating global variables. That is something that you don’t need to do in normal programs.
9.4.4.1 Conditionally creating variables
When it comes to conditionally creating variables, you have two options.
Option 1 – `typeof` and `var`:
This option only works in global scope (and therefore not inside ES6 modules).
Option 2 – `window`:
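A rough sketch of both options. The names `someGlobal` and `otherGlobal` are hypothetical, and `globalThis` (ES2020) is swapped in for `window` as an assumption, so that the sketch also runs outside browsers. Note that option 1 only creates a true global when this code runs as a script in global scope:

```javascript
// Option 1: typeof check plus var. The var declaration is hoisted, so the
// name always exists; only the assignment is conditional.
if (typeof someGlobal === 'undefined') {
    var someGlobal = { created: 'option 1' };
}

// Option 2: property check on the global object.
if (!('otherGlobal' in globalThis)) {
    globalThis.otherGlobal = { created: 'option 2' };
}
```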
9.4.5 Why is there a temporal dead zone?
There are several reasons why `const` and `let` have temporal dead zones:
- To catch programming errors: Being able to access a variable before its declaration is strange. If you do so, it is normally by accident and you should be warned about it.
- For `const`: Making `const` work properly is difficult. Quoting Allen Wirfs-Brock: “TDZs … provide a rational semantics for `const`. There was significant technical discussion of that topic and TDZs emerged as the best solution.” `let` also has a temporal dead zone so that switching between `let` and `const` doesn’t change behavior in unexpected ways.
9.5 `let` and `const` in loop heads
The following loops allow you to declare variables in their heads: `for`, `for-in` and `for-of`.
To make a declaration, you can use either `var`, `let` or `const`. Each of them has a different effect, as I’ll explain next.
9.5.1 `for` loop
`var`-declaring a variable in the head of a `for` loop creates a single binding (storage space) for that variable:
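A minimal sketch with three arrow functions created inside the loop (the names `arrVar` and `i` are illustrative):

```javascript
const arrVar = [];
for (var i = 0; i < 3; i++) {
    arrVar.push(() => i);    // every arrow function closes over the same i
}
const resultsVar = arrVar.map(f => f());   // [3, 3, 3]
```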
Every reference to the loop variable in the bodies of the three arrow functions refers to the same binding, which is why they all return the same value.
If you `let`-declare a variable, a new binding is created for each loop iteration:
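The same sketch with `let` (names again illustrative):

```javascript
const arrLet = [];
for (let i = 0; i < 3; i++) {
    arrLet.push(() => i);    // each arrow function closes over its own i
}
const resultsLet = arrLet.map(f => f());   // [0, 1, 2]
```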
This time, each reference to the loop variable refers to the binding of one specific iteration and preserves the value that was current at that time. Therefore, each arrow function returns a different value.
`const` works like `var`, but you can’t change the initial value of a `const`-declared variable:
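A sketch of what happens when the increment expression tries to change the constant; the error is caught so that it can be inspected, and the names are illustrative:

```javascript
const visited = [];
let incrementError = null;
try {
    for (const i = 0; i < 3; i++) {   // i++ tries to change a constant
        visited.push(i);              // runs once, with i === 0
    }
} catch (e) {
    incrementError = e;               // TypeError
}
```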
Getting a fresh binding for each iteration may seem strange at first, but it is very useful whenever you use loops to create functions that refer to loop variables, as explained in a later section.
`for` loop: per-iteration bindings in the spec
The evaluation of the `for` loop handles `var` as the second case and `let`/`const` as the third case. Only `let`-declared variables are added to the list `perIterationLets` (step 9), which is passed to `ForBodyEvaluation()` as the second-to-last parameter, `perIterationBindings`.
9.5.2 `for-of` loop and `for-in` loop
In a `for-of` loop, `var` creates a single binding:
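A minimal sketch, with illustrative names:

```javascript
const arrOfVar = [];
for (var v of [0, 1, 2]) {
    arrOfVar.push(() => v);   // all arrow functions share one binding
}
const ofVarResults = arrOfVar.map(f => f());   // [2, 2, 2]
```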
`const` creates one immutable binding per iteration:
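The corresponding sketch with `const` (names illustrative):

```javascript
const arrOfConst = [];
for (const c of [0, 1, 2]) {
    arrOfConst.push(() => c);   // each arrow function gets its own binding
}
const ofConstResults = arrOfConst.map(f => f());   // [0, 1, 2]
```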
`let` also creates one binding per iteration, but the bindings it creates are mutable.
The `for-in` loop works similarly to the `for-of` loop.
`for-of` loop: per-iteration bindings in the spec
Per-iteration bindings in `for-of` are handled by `ForIn/OfBodyEvaluation`. In step 5.b, a new environment is created and bindings are added to it via `BindingInstantiation` (mutable for `let`, immutable for `const`). The current iteration value is stored in the variable `nextValue` and used to initialize the bindings, either for a single declared variable or via destructuring.
9.5.3 Why are per-iteration bindings useful?
The following is an HTML page that displays three links:
- If you click on “yes”, it is translated to “ja”.
- If you click on “no”, it is translated to “nein”.
- If you click on “perhaps”, it is translated to “vielleicht”.
What is displayed depends on the loop variable that holds the translation (line B). If we had used `var` instead of `const` in line A, there would be a single binding for the whole loop and that variable would have the value `'vielleicht'` afterwards. Therefore, no matter what link you click on, you would always get the translation `'vielleicht'`.
Thankfully, with `const`, we get one binding per loop iteration and the translations are displayed correctly.
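The effect can be reproduced without a browser. In this DOM-free sketch, a `Map` from link text to handler function stands in for the click listeners; all names (`entries`, `handlers`, `source`, `target`) are assumptions:

```javascript
const entries = [['yes', 'ja'], ['no', 'nein'], ['perhaps', 'vielleicht']];

// Stand-in for click listeners: one handler per link text.
const handlers = new Map();
for (const [source, target] of entries) { // (A)
    handlers.set(source, () => target);   // (B)
}

// The same loop with var: one binding is shared by every handler.
const varHandlers = new Map();
for (var [src, tgt] of entries) {
    varHandlers.set(src, () => tgt);
}

const clickedNo = handlers.get('no')();         // 'nein'
const clickedNoVar = varHandlers.get('no')();   // 'vielleicht'
```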
9.6 Parameters as variables
9.6.1 Parameters versus local variables
If you `let`-declare a variable that has the same name as a parameter, you get a static (load-time) error:
Doing the same inside a block shadows the parameter:
In contrast, `var`-declaring a variable that has the same name as a parameter does nothing, just like re-declaring a `var` variable within the same scope does nothing.
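The three cases can be sketched together. All names are illustrative, and `new Function` is used only so that the load-time error becomes catchable at runtime:

```javascript
// Case 1: let-declaring a parameter's name is an early (parse-time) error.
let redeclareError = null;
try {
    new Function('arg', 'let arg;');
} catch (e) {
    redeclareError = e;           // SyntaxError
}

// Case 2: inside a block, let merely shadows the parameter.
function shadowInBlock(arg) {
    {
        let arg = 'inner';        // separate variable, local to this block
    }
    return arg;                   // the parameter is unchanged
}

// Case 3: var-declaring the parameter's name does nothing.
function redeclareWithVar(arg) {
    var arg;                      // same variable as the parameter
    return arg;
}

const shadowResult = shadowInBlock('outer');   // 'outer'
const varResult = redeclareWithVar('outer');   // 'outer'
```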
9.6.2 Parameter default values and the temporal dead zone
If parameters have default values, they are treated like a sequence of `let` statements and are subject to temporal dead zones:
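A sketch with two hypothetical functions, one whose defaults are evaluated in a valid order and one that runs into the dead zone:

```javascript
// OK: y refers to x after x has been initialized.
function foo(x = 1, y = x) {
    return [x, y];
}

// Not OK: x refers to y while y is still in its TDZ.
function bar(x = y, y = 2) {
    return [x, y];
}

const fooResult = foo();   // [1, 1]

let barError = null;
try {
    bar();
} catch (e) {
    barError = e;          // ReferenceError
}
```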
9.6.3 Parameter default values don’t see the scope of the body
The scope of parameter default values is separate from the scope of the body (the former surrounds the latter). That means that methods or functions defined “inside” parameter default values don’t see the local variables of the body:
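A minimal sketch of the separate scopes, using the hypothetical names `where` and `probe`:

```javascript
const where = 'outer';

function probe(func = () => where) {
    const where = 'inner';   // local to the body, invisible to the default value
    return func();
}

const seenByDefault = probe();   // 'outer'
```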
9.7 The global object
- All properties of the global object are global variables. In global scope, the following declarations create such properties:
  - `var` declarations
  - Function declarations
- But there are now also global variables that are not properties of the global object. In global scope, the following declarations create such variables:
  - `let` declarations
  - `const` declarations
  - Class declarations
Note that the bodies of modules are not executed in global scope, only scripts are. Therefore, the environments for various variables form the following chain.
9.8 Function declarations and class declarations
Function declarations:
- are block-scoped, like `let`.
- create properties in the global object (while in global scope), like `var`.
- are hoisted: independently of where a function declaration is mentioned in its scope, it is always created at the beginning of the scope.
The following code demonstrates the hoisting of function declarations:
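A small sketch (the names `hoisted` and `greet` are assumptions) in which the function is called before the line that declares it:

```javascript
let hoisted;
{ // Enter a new scope
    hoisted = greet();   // OK: the declaration below is hoisted
    function greet() {
        return 'hello';
    }
}
```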
Class declarations:
- are block-scoped.
- don’t create properties on the global object.
- are not hoisted.
Classes not being hoisted may be surprising, because, under the hood, they create functions. The rationale for this behavior is that the values of their `extends` clauses are defined via expressions and those expressions have to be executed at the appropriate times.
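A sketch of the consequence, with the hypothetical names `identity` and `MyClass`. The early instantiation is wrapped in `try`/`catch` so that the rest of the block still runs:

```javascript
let classError = null;
let instanceOk = false;
{ // Enter a new scope
    const identity = x => x;

    try {
        new MyClass();        // MyClass is still in its temporal dead zone
    } catch (e) {
        classError = e;       // ReferenceError
    }

    // The extends clause is an expression that has to run at this point.
    class MyClass extends identity(Object) {}

    instanceOk = new MyClass() instanceof MyClass;   // true
}
```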
9.9 Coding style: `const` versus `let` versus `var`
I recommend always using either `let` or `const`:
- Prefer `const`. You can use it whenever a variable never changes its value. In other words: the variable should never be the left-hand side of an assignment or the operand of `++` or `--`. Changing an object that a `const` variable refers to is allowed.
You can even use `const` in a `for-of` loop, because one (immutable) binding is created per loop iteration.
Inside the body of the `for-of` loop, the loop variable can’t be changed.
- Otherwise, use `let` – when the initial value of a variable changes later on.
- Avoid `var`.
If you follow these rules, `var` will only appear in legacy code, as a signal that careful refactoring is required.
`var` does one thing that `let` and `const` don’t: variables declared via it become properties of the global object. However, that’s generally not a good thing. You can achieve the same effect by assigning to `window` (in browsers) or `global` (in Node.js).
9.9.1 An alternative approach
An alternative to the just mentioned style rules is to use `const` only for things that are completely immutable (primitive values and frozen objects). Then we have two approaches:
- Prefer `const`: `const` marks immutable bindings.
- Prefer `let`: `const` marks immutable values.
I lean slightly in favor of #1, but #2 is fine, too.