Archive for January 2011
Tensors in terms of components III: non-orthogonal bases
Non-orthogonal or oblique axes
If you have been following our sequence of posts concerning the basics of tensors, you know that we are starting from Euclidean tensors, with components written in terms of Cartesian (orthonormal) basis vectors, that is, basis vectors with the following characteristics:
$\hat{e}_i \cdot \hat{e}_j = 0$ for $i \neq j$,
$\hat{e}_i \cdot \hat{e}_i = 1$ (no sum over $i$).
The equations above say that the basis vectors are mutually orthogonal and of unit length, which in particular makes them linearly independent. Now let us consider a more general set of basis vectors. This new set must also be linearly independent, which in 3D means that the three vectors cannot be coplanar. But there are no further restrictions on this new basis. It is a non-orthogonal basis.
Let us write that new, non-orthogonal or oblique basis as the set $\{\vec{e}_1, \vec{e}_2, \vec{e}_3\}$. The vectors of this set span the space, so we can also expand an arbitrary vector $\vec{v}$ in terms of those basis vectors.
Note: until now, we have written the components of a vector with subscript indices, e.g. $v_i$ or $u_i$. Now we will change that notation, and write the components of vectors with superscripts instead. Why we do that will become clear at some point, but for the moment it is important to make this distinction. This notation will only be valid for the components of a vector expanded in an orthogonal or non-orthogonal basis. Below we will add a new notation for another case.
Hence, we have:
$\vec{v} = v^i \vec{e}_i$.
The Reciprocal Basis
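Notice that since, in general,
$\vec{e}_i \cdot \vec{e}_j \neq \delta_{ij}$,
we no longer have a simple and nice expression for the dot product, because now we have:
$\vec{u} \cdot \vec{v} = (u^i \vec{e}_i) \cdot (v^j \vec{e}_j) = u^i v^j \, \vec{e}_i \cdot \vec{e}_j$.
What do we do with $\vec{e}_i \cdot \vec{e}_j$?
Well, we work out the dot product of the basis vectors the same way as usual, but now you must keep the lengths of the basis vectors and the cosine of the angle between them, which before combined to just 0 or 1. For that reason the dot product between vectors in an oblique basis is no longer as nice as $u_i v_i$.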
But we can make it look nice again by introducing the reciprocal basis.
So let us see how this is possible. Define the reciprocal basis vectors $\vec{e}^{\,i}$ as:
$\vec{e}^{\,i} \cdot \vec{e}_j = \delta^i_j$.
Notice that the reciprocal basis vectors have superscripts. So what do we have so far? Vector components are written with superscripts, orthogonal and non-orthogonal basis vectors are written with subscripts, and the reciprocal basis vectors are written with superscripts, and are defined in the equation above. Notice that you need the non-orthogonal basis vectors to define the reciprocal basis vectors.
What does that definition mean? Well, in 3D you are able to see that, say, the reciprocal vector $\vec{e}^{\,1}$ is perpendicular to $\vec{e}_2$ and $\vec{e}_3$. Therefore, the former vector is somewhat, but not exactly, in the direction of $\vec{e}_1$, and so on. Its magnitude is set so that the dot product normalizes to 1:
$\vec{e}^{\,1} \cdot \vec{e}_1 = 1$.
We call the original basis $\{\vec{e}_i\}$, which is in turn reciprocal to the reciprocal basis, the direct basis.
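As a minimal numerical sketch of the definition above, we can build the reciprocal vectors for one particular oblique basis and check the defining relation. The basis `E` and the helper name `reciprocal_basis` are our own illustrative choices, not part of [Nea10]; the only fact used is that the reciprocal vectors are fixed by requiring their dot products with the direct basis to form the identity matrix.

```python
import numpy as np

def reciprocal_basis(E):
    """Rows of E are the direct basis vectors e_1, e_2, e_3 (Cartesian components).
    Returns R whose rows are the reciprocal vectors e^1, e^2, e^3,
    fixed by the condition e^i . e_j = delta^i_j."""
    # (R @ E.T)[i, j] = e^i . e_j, so we need R @ E.T = I, i.e. R = inv(E.T)
    return np.linalg.inv(E.T)

# Hypothetical oblique (non-orthogonal, non-coplanar) basis, for illustration only
E = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 2.0]])
R = reciprocal_basis(E)

print(R @ E.T)      # numerically the identity matrix
print(R[0] @ E[1])  # e^1 is perpendicular to e_2
```

In 3D the same vectors can also be obtained from cross products, e.g. the first reciprocal vector is the cross product of the second and third direct vectors divided by the triple product of the basis.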
Recovering the nice dot product form
Now we expand an arbitrary vector in terms of the reciprocal basis vectors. Notice, however, that since these are written with superscripts, we write the vector components as subscripts when expanding in terms of the reciprocal basis vectors. We do that because we want to use the balance of up/down indices as a means to contrast both representations, and this will make our life easier later on. So we write:
$\vec{u} = u_i \vec{e}^{\,i}$,
$\vec{v} = v^j \vec{e}_j$.
Therefore,
$\vec{u} \cdot \vec{v} = u_i v^j \, \vec{e}^{\,i} \cdot \vec{e}_j = u_i v^j \delta^i_j$,
$\vec{u} \cdot \vec{v} = u_i v^i$.
There you have it, a nice dot product again.
Note: You could also have written it as:
$\vec{u} \cdot \vec{v} = u^i v_i$.
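The recovered dot product can be checked numerically. The sketch below assumes a specific oblique basis and two arbitrary vectors (all values are ours, chosen only for illustration). It computes the subscript components by dotting with the direct basis, the superscript components by dotting with the reciprocal basis, and compares the mixed up/down sums with the ordinary Cartesian dot product.

```python
import numpy as np

E = np.array([[1.0, 0.0, 0.0],    # e_1
              [1.0, 1.0, 0.0],    # e_2
              [0.0, 1.0, 2.0]])   # e_3  (hypothetical oblique basis)
R = np.linalg.inv(E.T)            # rows are the reciprocal vectors e^1, e^2, e^3

u = np.array([2.0, -1.0, 0.5])    # arbitrary vectors, Cartesian components
v = np.array([0.3, 4.0, -2.0])

u_down = E @ u   # u_i = u . e_i  (components along the reciprocal basis)
u_up = R @ u     # u^i = u . e^i  (components along the direct basis)
v_down = E @ v
v_up = R @ v

print(u @ v, u_down @ v_up, u_up @ v_down)  # three equal numbers
print(u_up @ v_up)                          # up-up sum: not the dot product
```

The last line illustrates why the up/down balance matters: summing two superscript sets of components does not reproduce the dot product in an oblique basis.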
Revisiting Einstein's Summation Convention
Now that we have up and down indices, we settle that, from now on, when talking about non-orthogonal bases, like indices in any given term must be summed whenever one is up and the other is down. So if you end up with an expression like $u_i v_i$, with both repeated indices down (or both up), it is clear that you have made a mistake.
Ref.: [Nea10].
Online Classes Resources
There is an interesting site for students looking for online material and tips, so that they can work towards their interests by following freely available, high-quality material, written by professional scientists. According to that site,
OnlineClasses.org endeavors to be your comprehensive source on the best programs available online, as well as a resource for study tips, frequently asked questions, and other relevant information that will make your online education experience the best that it can be.
Toy Universes has been featured in their recent compilation “50 Awesome Blogs for Physics Geeks”. There you can find many interesting sites and useful blogs in physics. We are happy to be included among great, well-established blogs. Thank you!
More details on two views of tensors
Note:
We adopt any of the following notations for the Cartesian (orthonormal) basis vectors and for the components of a vector $\vec{v}$ in that basis, respectively:
,
,
,
,
,
,
and similarly for the corresponding primed (transformed) basis and vectors. We may use these notations interchangeably in this blog.
End of Note.
In the previous post, we deduced two useful expressions for defining a tensor T of rank 2, namely (recall we are using Einstein’s summation convention):
1) $T(\vec{u}, \vec{v}) = T_{ij} u_i v_j$,
where $u_i, v_j$ are the $i$-th and $j$-th components, respectively, of the vectors $\vec{u}, \vec{v}$ in the orthonormal basis $\{\hat{e}_i\}$, with indices $i, j$ running from 1 to 3;
2) $T'_{ij} = S_{ki} S_{lj} T_{kl}$, with
$\hat{e}'_i = S_{ji} \hat{e}_j$,
where $S_{ji}$ is a set of real numbers that gives a rule for the transformation from the unprimed to the primed basis vectors.
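Let us review these expressions now in a little more detail. Since they are simply two ways of defining a tensor of rank 2, let us generalize them for a tensor T of rank r, defined in a space of $N$ dimensions (for the moment, we are addressing Euclidean tensors). Such a tensor is defined, therefore, either as:
1) a multilinear functional:
$T(\vec{u}, \vec{v}, \ldots, \vec{w}) = T_{i_1 i_2 \cdots i_r} u_{i_1} v_{i_2} \cdots w_{i_r}$,
with the indices $i_1, i_2, \ldots, i_r$ running from 1 to N. Therefore the expression above involves $r$ sums (one for each index).
2) a set of $N^r$ quantities, which transform as:
$T'_{i_1 i_2 \cdots i_r} = S_{j_1 i_1} S_{j_2 i_2} \cdots S_{j_r i_r} T_{j_1 j_2 \cdots j_r}$,
again with $\hat{e}'_i = S_{ji} \hat{e}_j$.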
You may have recognized definition (2) above as a generalization of a very simple situation in 2D Euclidean space, in which we want to find how the components of a vector (namely, a tensor of rank 1) transform as we rotate the basis by an angle $\theta$. See Fig. 1:

Fig. 1 - Rotation of the basis by a fixed angle $\theta$. Orthonormal basis vectors (primed and unprimed) are shown on the left.
It is easy to show that:
$v'_1 = v_1 \cos\theta + v_2 \sin\theta$,
$v'_2 = -v_1 \sin\theta + v_2 \cos\theta$,
and given that $\sin\theta = \cos(\pi/2 - \theta)$ and $-\sin\theta = \cos(\pi/2 + \theta)$,
we have:
$v'_1 = v_1 \cos(x, x') + v_2 \cos(y, x')$,
$v'_2 = v_1 \cos(x, y') + v_2 \cos(y, y')$.
Above, in the argument of the cosines, by, say, $(y, x')$ we mean the angle between the axes defined by $y$ and $x'$ (old or unprimed axis, new or primed axis).
To avoid cumbersome writing, we simply define:
$a_{ij} = \cos(x_i, x'_j)$,
with both indices $i, j$ running from 1 to 2. With this notation we mean that, say:
$a_{12} = \cos(x, y')$, etc.
Therefore, the first “slot” in the cosine argument is always the unprimed (old) axis, and the second slot, the primed (new) axis, with the corresponding number indicating whether it is the $x$ (1) or $y$ (2) axis. Notice that these cosines are the direction cosines between two given basis vectors. Since the basis vectors are unit vectors, it means that their dot products are also given by these direction cosines: $a_{ij} = \hat{e}_i \cdot \hat{e}'_j$.
We write, then, compactly, the above set of equations relating the transformation of the components of the vector:
$v'_j = a_{ij} v_i$.
Compare that expression with the relation found for the tensor of rank 2 transformation law (definition 2 above):
$T'_{ij} = S_{ki} S_{lj} T_{kl}$,
or with the general relation for a tensor of rank r:
$T'_{i_1 i_2 \cdots i_r} = S_{j_1 i_1} S_{j_2 i_2} \cdots S_{j_r i_r} T_{j_1 j_2 \cdots j_r}$.
We immediately recognize that the set $S_{ij}$ is related to the cosines of the angles between the old and new coordinate axes, in the case where the coordinate transformation is an orthogonal transformation. This is usually the case in basic applications in Euclidean spaces.
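The 2D rotation discussed above can be sketched in a few lines. The function name `direction_cosines`, the angle and the vector are our own illustrative choices. The table of cosines is built directly as dot products between old and new unit basis vectors, and the new components are compared with the explicit sine and cosine formulas.

```python
import numpy as np

def direction_cosines(theta):
    """Entry [i, j] is the cosine of the angle between old axis i and new axis j,
    i.e. the dot product of the old unit vector i with the new unit vector j,
    for a basis rotated counterclockwise by theta."""
    old = np.eye(2)                                    # rows: old basis vectors
    new = np.array([[np.cos(theta), np.sin(theta)],    # new first basis vector
                    [-np.sin(theta), np.cos(theta)]])  # new second basis vector
    return old @ new.T

theta = 0.4                  # arbitrary angle, in radians
a = direction_cosines(theta)
v = np.array([3.0, 1.0])     # components in the old basis
v_new = a.T @ v              # new component j = sum over i of a[i, j] * v[i]

print(v_new)
print(np.linalg.norm(v), np.linalg.norm(v_new))  # the length is unchanged
```

Because the table of direction cosines is an orthogonal matrix, the length of the vector comes out the same in both bases.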
Now we go back to definition (1) above, namely, that a tensor can also be defined as a multilinear functional of the form:
$T(\vec{u}, \vec{v}, \ldots, \vec{w}) = T_{i_1 i_2 \cdots i_r} u_{i_1} v_{i_2} \cdots w_{i_r}$.
In order to make a connection of the above with what we know from usual 3D vector analysis, we consider a vector, $\vec{v}$, and its projection on a fixed direction, determined, say, by a fixed unit vector $\hat{n}$. Let us call that projection $v_n$. See Fig. 2.
The direction cosines associated with $\hat{n}$ are the cosines of the angles between that direction and each of the Cartesian axes:
$n_i = \hat{n} \cdot \hat{e}_i = \cos(\hat{n}, \hat{e}_i)$.
So one can express the projection $v_n$ in terms of the components of $\vec{v}$ in the Cartesian basis (always recall Einstein’s summation convention):
$v_n = v_i n_i$.
Notice that the projection $v_n$ is independent of the choice of basis. If you change the basis, the direction cosines will change of course, as well as the components of $\vec{v}$, but the projection of $\vec{v}$ on the direction $\hat{n}$ will always give the same result: we say that $v_n$ is an invariant. This is therefore a geometric property that must be naturally preserved under such coordinate transformations.
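A quick numerical check of this invariance, as a sketch under our own assumptions: the primed basis is obtained by a rotation about the third axis, the columns of the matrix `S` hold the new basis vectors written in the old basis, and the helper `rotate_components` is a hypothetical name for the component transformation.

```python
import numpy as np

def rotate_components(S, w):
    """Components of w in the primed basis, whose vectors are the columns of S:
    each new component is the dot product of w with a new basis vector."""
    return S.T @ w

theta = 0.7  # arbitrary rotation angle about the third axis
S = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

v = np.array([1.0, 2.0, 3.0])
n = np.array([2.0, -1.0, 2.0]) / 3.0   # unit vector; its components are direction cosines

proj_old = v @ n
proj_new = rotate_components(S, v) @ rotate_components(S, n)
print(proj_old, proj_new)  # same projection in both bases
```

Both the components of the vector and the direction cosines change, but their contracted sum does not.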
So, you see, a vector can also be defined as an invariant linear functional of one direction, consistent with what we have been stating previously, and this can be seen clearly from the deductions above, namely, if we write the projection of that vector on a fixed direction in terms of its components and the direction cosines of that fixed direction.
A bilinear functional given in (1) above, that is,
$T(\vec{u}, \vec{v}) = T_{ij} u_i v_j$,
can therefore be seen as an invariant bilinear functional of two directions defined by $\vec{u}, \vec{v}$. Clearly, the components of these vectors actually represent the direction cosines when the vectors are normalized to unit length.
On the other hand, if a vector $\vec{v}$ is supposed to be a tensor of rank 1, then according to the above, we should identify:
$\vec{v}(\vec{u}) = v_i u_i$,
so that when we “apply” $\vec{v}$ to the arbitrary, fixed unit vector $\hat{n}$, we get:
$\vec{v}(\hat{n}) = v_i n_i$.
But
$n_i = \hat{e}_i \cdot \hat{n}$,
therefore
$\vec{v}(\hat{n}) = v_i \, \hat{e}_i \cdot \hat{n}$,
which is, of course,
$\vec{v}(\hat{n}) = \vec{v} \cdot \hat{n}$,
the dot product between vectors $\vec{v}$ and $\hat{n}$. This is the projection of vector $\vec{v}$ in the direction given by $\hat{n}$, which, as we have seen, is an invariant (independent of the coordinate system used). Indeed, the dot product between two vectors is an invariant, that is, a scalar, under orthogonal transformations.
In summary, the present post clarifies the two notions of tensors, as stated in items (1) and (2) above, checking the overall consistency with simple examples from Euclidean vector analysis.
Tensors in terms of components II: changing basis
In our previous post, we arrived at the conclusion that, in a 3D Euclidean space, we need a set of 9 real numbers to specify a tensor of rank 2: we can express the action of such a tensor as a $3 \times 3$ matrix defined by those 9 numbers. We also learned that these numbers depend on the basis chosen.
On the other hand, we have stressed several times that a tensor by itself is a geometrical object, whose properties are independent of the choice of a coordinate basis to describe it. Imagine a triangle. It has its own properties (like relations and proportions between angles, sides, etc.) without resorting to a coordinate system. It is a geometrical object in its own right. It is of course very useful to describe a triangle using analytic geometry, in which coordinates are adopted. One view or the other is a question of context.
We continue here to review tensors in terms of components, but we will not lose sight of the fact that they are geometrical objects. The geometrical view will be clarified in future posts, especially when we go deeper into differential geometry. For the moment, let us explore in this short note how the components of a tensor change when we change the basis.
We will first prove here that each component $T_{ij}$ of the tensor T in fact satisfies:
$T_{ij} = T(\hat{e}_i, \hat{e}_j)$.
That is a useful result. It states that the components of the tensor T can be found easily when you use the fact that the latter is a machine with two slots that can be filled with each combination of the basis vectors $\hat{e}_i, \hat{e}_j$; in that case, it outputs a real number, which just corresponds to one of its components in that basis. We shall prove that result first and use it for expressing how the components of T change when we change the basis.
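To that aim, if you recall the representation theorem for linear functionals (introduced in this post; see also the Theorems page in the side bar), you know that the bilinear functional above is associated with a vector-valued function:
$T(\vec{u}, \vec{v}) = \vec{u} \cdot T(\vec{v})$,
where $\vec{u}, \vec{v}$ are arbitrary vectors. Writing these in terms of components, using the basis vectors $\hat{e}_i$, we have (recall that we use the summation convention):
$\vec{u} = u_i \hat{e}_i$,
$\vec{v} = v_j \hat{e}_j$.
From linearity, we also have:
$T(\vec{u}, \vec{v}) = u_i v_j \, T(\hat{e}_i, \hat{e}_j)$.
Then, from linearity as well:
$\vec{u} \cdot T(\vec{v}) = u_i v_j \, \hat{e}_i \cdot T(\hat{e}_j)$.
But, at the same time, we also have that:
$\hat{e}_i \cdot T(\hat{e}_j) = \hat{e}_i \cdot (T_{kj} \hat{e}_k)$.
Note that we have used above the relation stated in our previous post, namely (re-labeling the indices):
$T(\hat{e}_j) = T_{kj} \hat{e}_k$.
From the orthonormality of the basis vectors,
$\hat{e}_i \cdot \hat{e}_k = \delta_{ik}$,
where $\delta_{ik}$ is the Kronecker delta, we find from the preceding equations that:
$\hat{e}_i \cdot T(\hat{e}_j) = T_{kj} \delta_{ik} = T_{ij}$.
Collecting the results above,
$T(\vec{u}, \vec{v}) = u_i v_j \, T(\hat{e}_i, \hat{e}_j) = u_i v_j \, T_{ij}$,
therefore proving that:
$T_{ij} = T(\hat{e}_i, \hat{e}_j)$.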
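The result just proved suggests a very direct recipe: feed every pair of basis vectors into the machine and store the outputs. Below is a small sketch of that recipe. The particular bilinear functional `T`, built from two fixed vectors `a` and `b`, is a hypothetical example of ours, chosen so that no matrix is written down in advance.

```python
import numpy as np

a = np.array([1.0, 0.0, 2.0])   # fixed vectors used only to build an example
b = np.array([0.0, 1.0, 1.0])

def T(u, v):
    """A hypothetical bilinear functional: two vectors in, one real number out."""
    return (a @ u) * (b @ v) + 2.0 * (u @ v)

basis = np.eye(3)   # rows: the orthonormal basis vectors
components = np.array([[T(basis[i], basis[j]) for j in range(3)]
                       for i in range(3)])

u = np.array([1.0, -1.0, 2.0])
v = np.array([0.5, 3.0, 1.0])
print(T(u, v), u @ components @ v)   # the components reproduce the functional
```

The second printed number is the double sum over the stored components contracted with the components of the two input vectors.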
Now, suppose you change to a new (primed) basis:
$\hat{e}'_i = S_{ji} \hat{e}_j$,
where $S_{ji}$ is a set of real numbers expressing the rule by which the basis changes. (As an example, you may suppose a rotation of the basis. In that case, the $S_{ji}$'s form a rotation matrix.)
On the other hand, you also have the same bilinear rule in that new basis, namely:
$T'_{ij} = T(\hat{e}'_i, \hat{e}'_j)$.
That means:
$T'_{ij} = T(S_{ki} \hat{e}_k, S_{lj} \hat{e}_l)$.
From linearity:
$T'_{ij} = S_{ki} S_{lj} \, T(\hat{e}_k, \hat{e}_l)$,
$T'_{ij} = S_{ki} S_{lj} T_{kl}$.
So you have the final result: you now know how the components of T change under a change of basis.
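Here is a minimal sketch of this transformation law for a rotation of the basis. The function name `transform_rank2`, the angle, and the component values are our own assumptions; the columns of `S` hold the primed basis vectors written in the unprimed basis. The check at the end is that the number produced by the bilinear machine is the same in both bases.

```python
import numpy as np

def transform_rank2(T, S):
    """New components: sum over k and l of S[k, i] * S[l, j] * T[k, l]."""
    return np.einsum('ki,lj,kl->ij', S, S, T)

theta = 0.3  # arbitrary rotation angle about the third axis
S = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

T = np.arange(9.0).reshape(3, 3)     # hypothetical components in the old basis
T_new = transform_rank2(T, S)

u = np.array([1.0, -2.0, 0.5])
v = np.array([0.0, 3.0, 1.0])
u_new, v_new = S.T @ u, S.T @ v      # vector components in the primed basis

print(u @ T @ v, u_new @ T_new @ v_new)  # the same number in both bases
```

In matrix language the same operation reads as the transpose of `S` times `T` times `S`, which is how one would usually write it for rank 2; the index form generalizes to higher rank.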
Ref.: [Nea10].
Tensors in terms of components
Based on what we have stated up to now concerning tensors, you must be eager to learn how to make the connection between the index-free and full-index languages. This post aims to serve as a first connection between both languages. You may find out, however, that the connection will come in a very natural way: basically from the linearity property of tensors. For now, our present exposition will be as simple as possible.
We are working with tensors living in “simple” spaces, namely, Euclidean tensors. Recall from previous posts that we have extended the definition of tensors to include multilinear functionals. So the picture we have up to now is that a tensor is like a machine with a certain number of slots to be filled with input vectors, and that, after “processing” these input vectors, the machine returns another tensor of a given rank (0, 1, 2,…), depending on the nature of the tensor.
For example, recall the 3D Euclidean metric $d$, given by the Pythagorean formula:
$d(P, Q) = \sqrt{(x_P - x_Q)^2 + (y_P - y_Q)^2 + (z_P - z_Q)^2}$.
The situation above is that you have two points, $P, Q$, in a space $\mathbb{R}^3$, with the metric $d$ attached to it, so that the metric function gives you a rule to calculate distances between these points. The coordinates (ordered triples) $(x, y, z)$ of each point can be compactly written as a vector $\vec{r}$, that is, a geometrical object which, under the Cartesian coordinatization, has components $(x, y, z)$. We agree that all such “position” vectors are “attached” to the origin $O$ (their “tail” is at the origin, and their “head” is at the given point). This is just a useful image for vectors in Euclidean spaces; in general manifolds, we will have to give up on that vision occasionally.
Therefore, the metric is nothing but a machine with two slots that outputs a real number. Strictly speaking, the object that is linear in each slot is the dot product $g(\vec{u}, \vec{v}) = \vec{u} \cdot \vec{v}$, from which the distance follows as $d(P, Q) = \sqrt{g(\vec{r}_P - \vec{r}_Q, \vec{r}_P - \vec{r}_Q)}$. It is a tensor of rank 2, because we have two input vectors and one output scalar, and a scalar counts as zero for the computation of the rank (as already noted in a previous post).
To emphasize our picture of tensors as machines, we shall temporarily use the following (unorthodox) notation for tensors: given a tensor $T$ of rank $r$, it has $n$ input slots (operates on $n$ vectors $\vec{v}_1, \ldots, \vec{v}_n$) and outputs another tensor of rank $r - n$. We write the tensor $T$ as:
$T(\vec{v}_1, \ldots, \vec{v}_n)$.
For example, the metric tensor (rank 2) is written as:
$g(\vec{u}, \vec{v}) = s$,
where $s$ is a scalar.
Suppose now another tensor, also of rank 2, but now its nature is a bit different from the former:
$T(\vec{v}) = \vec{u}$.
That tensor receives as input the vector $\vec{v}$ and outputs another vector, $\vec{u}$. Let us examine the above expression carefully. You can change perspective and see that the vector $\vec{u}$ is in fact a vector-valued function of the vector $\vec{v}$:
$\vec{u} = \vec{u}(\vec{v}) = T(\vec{v})$.
But the vector $\vec{v}$ can be written in terms of the three orthonormal basis vectors (i.e., mutually orthogonal and of unit length), $\hat{x}, \hat{y}, \hat{z}$:
$\vec{v} = v_x \hat{x} + v_y \hat{y} + v_z \hat{z}$,
where the triple of real numbers ($v_x, v_y, v_z$) are the coordinates of the “head” of the vector $\vec{v}$ (i.e., its components). Insert that into the previous equation, and you have:
$\vec{u} = T(v_x \hat{x} + v_y \hat{y} + v_z \hat{z})$.
Because of the linearity property of tensors, the above equation is:
$\vec{u} = v_x T(\hat{x}) + v_y T(\hat{y}) + v_z T(\hat{z})$.
But each of the objects $T(\hat{x}), T(\hat{y}), T(\hat{z})$ is a vector, right? That means you can also expand them in terms of basis vectors, as you did with $\vec{v}$. Let us write each expansion as (*):
$T(\hat{x}) = T_{xx} \hat{x} + T_{yx} \hat{y} + T_{zx} \hat{z}$,
$T(\hat{y}) = T_{xy} \hat{x} + T_{yy} \hat{y} + T_{zy} \hat{z}$,
$T(\hat{z}) = T_{xz} \hat{x} + T_{yz} \hat{y} + T_{zz} \hat{z}$.
From the above expansions, you can see clearly that you need all these 9 numbers, $T_{xx}, T_{xy}, \ldots, T_{zz}$, in order to express the set of objects $T(\hat{x}), T(\hat{y}), T(\hat{z})$ in terms of the basis $\hat{x}, \hat{y}, \hat{z}$. In fact, you can see now that the tensor T, if it is to be seen in component form under a given basis, needs a specification of 9 numbers, or its 9 components. That happens because you must specify how the tensor T acts on each basis vector. You must also realize that these 9 numbers do depend on the chosen basis. Also, it is evident that in a 2D space, you would need 4 such numbers ($T_{xx}, T_{xy}, T_{yx}, T_{yy}$); in an n-dimensional space, $n^2$ numbers.
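Now, let us go back to the expression:
$\vec{u} = T(\vec{v})$.
Clearly, it is also true that you can write the vector $\vec{u}$ as:
$\vec{u} = u_x \hat{x} + u_y \hat{y} + u_z \hat{z}$.
So, we have:
$u_x \hat{x} + u_y \hat{y} + u_z \hat{z} = T(\vec{v})$.
But we have already deduced above that:
$T(\vec{v}) = v_x T(\hat{x}) + v_y T(\hat{y}) + v_z T(\hat{z})$.
Or:
$T(\vec{v}) = v_x (T_{xx} \hat{x} + T_{yx} \hat{y} + T_{zx} \hat{z}) + v_y (T_{xy} \hat{x} + T_{yy} \hat{y} + T_{zy} \hat{z}) + v_z (T_{xz} \hat{x} + T_{yz} \hat{y} + T_{zz} \hat{z})$.
Rearrange that in the following way:
$u_x \hat{x} + u_y \hat{y} + u_z \hat{z} = (T_{xx} v_x + T_{xy} v_y + T_{xz} v_z) \hat{x} + (T_{yx} v_x + T_{yy} v_y + T_{yz} v_z) \hat{y} + (T_{zx} v_x + T_{zy} v_y + T_{zz} v_z) \hat{z}$.
Equate each component of $\vec{u}$ on the left side with the corresponding expression in parentheses on the right side. You get:
$u_x = T_{xx} v_x + T_{xy} v_y + T_{xz} v_z$,
$u_y = T_{yx} v_x + T_{yy} v_y + T_{yz} v_z$,
$u_z = T_{zx} v_x + T_{zy} v_y + T_{zz} v_z$.
Evidently, you can write that system of equations compactly by collecting the components of T as a matrix operating on the given vector:
$\begin{pmatrix} u_x \\ u_y \\ u_z \end{pmatrix} = \begin{pmatrix} T_{xx} & T_{xy} & T_{xz} \\ T_{yx} & T_{yy} & T_{yz} \\ T_{zx} & T_{zy} & T_{zz} \end{pmatrix} \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix}$.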
Now, let us change notation in order to avoid cumbersome writing. Just let $x, y, z$ be labeled as numbers, $1, 2, 3$. So it is easy to see that the system of equations above can be written compactly as:
$u_i = \sum_{j=1}^{3} T_{ij} v_j$,
with indices $i, j$ running from 1 to 3. We shall often use Einstein’s summation convention (see Notation page on the side bar). So that can be written simply as:
$u_i = T_{ij} v_j$.
In summary, we see that $T$ gets as input the vector $\vec{v}$ and outputs vector $\vec{u}$. In the chosen basis $\{\hat{e}_i\}$ the components of these vectors are related by the equation above.
It is left to you to show (see a little bit above, (*)) that:
$T(\hat{e}_i) = T_{ji} \hat{e}_j$.
But that is simply an expression for a new transformed basis:
$\hat{e}'_i = T_{ji} \hat{e}_j$.
We have above precisely a law for the transformation of the basis vectors by the action of the tensor. See the inversion now: $T_{ji}$ for the transformation of the basis instead of $T_{ij}$ for the transformation of vectors.
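The two statements above (a matrix acting on components, and the images of the basis vectors) can be seen side by side in a short sketch. The numerical components are hypothetical, chosen by us for illustration. The explicit double loop and the summation-convention call compute the same thing, and the image of each basis vector turns out to be a column of the matrix, which is the index inversion mentioned in the text.

```python
import numpy as np

T = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 0.0],
              [0.0, -1.0, 4.0]])   # hypothetical components of the tensor
v = np.array([1.0, 2.0, 3.0])

# Explicit sum over the second index, one output component at a time
u_explicit = np.array([sum(T[i, j] * v[j] for j in range(3)) for i in range(3)])

# The same contraction written in summation-convention style
u = np.einsum('ij,j->i', T, v)

# Action on the basis vectors: the image of basis vector i is column i of T
e = np.eye(3)
images = [T @ e[i] for i in range(3)]

print(u_explicit, u)
print(images[0], T[:, 0])
```

So the rows of the matrix are what you read when transforming components, while the columns are what you read when transforming the basis vectors.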
You may be wondering how all that came out right. Well, we simply made the right choice of notation when writing the components of T — see (*) above — because we wanted all to be consistent at the end.
A note on matrix representation of tensors
If you have been following our previous posts on tensors, you have probably noticed that we have essentially made a few general and abstract remarks about tensors, and have not made any reference to more concrete techniques on how to compute with them (like, e.g., a reference to tensors as matrices). That fact is certainly one of the first things that bothers those who come from a physics-like background. It certainly disturbed me (and still does).
For that reason I have decided to write this short note, motivated by a reader with an engineering background (see a brief clarification in a subsequent comment). To relieve that natural distress, at least for a moment, we must state at this point that matrices can represent tensors.
So let us get this straight, at least as much as we can from what we have already learned. What we have been saying here is that, more generally, tensors are multilinear functionals: machines (in the language of [MTW73]) with a certain number of slots that may be completely or partially filled with inputs (namely, up to this point, vectors), giving as output another tensor (either of rank 0, 1, etc). The result depends on the nature of what the machine “operates” on.
For example, we have the Riemann tensor, so much used in General Relativity: it can be thought of as a machine with three slots (so it operates on three vectors), giving as output a vector. We will go into what that means when time comes, with both coordinate (index) language and free-index language. (Notice that in our abstract setting, up to now, we are using a free-index language. No reference to coordinates.) Just hang on!
Now, matrices. Tensors can be represented by matrices. The set of numbers that constitutes a matrix depends on the coordinate system used. (A few tensors can be represented by matrices that do not depend on the coordinate system, e.g., the isotropic tensors. But, generally, matrices representing tensors depend on the basis used, i.e. the coordinate system).
But whatever notion of tensors you carry, it is important to realize that a tensor is a geometrical object that exists independently of the coordinate system. So when you represent a tensor by a matrix, that is, by fixing a given basis, that matrix is a set of given numbers. When you represent the same tensor in another coordinate system, the associated matrix will in most cases be a set of completely different numbers. Because tensors, purely as they are (not when represented by matrices), do not depend on the coordinate system used, it is clear that when you see tensors as matrices the transformation law that connects both coordinate bases will have an important role. That is why, in index notation, tensors are usually defined as objects that transform in “such and such” way.
But, see, Nature does not care about the coordinates that we humans use. A given phenomenon will happen regardless of the system we choose to make measurements. This may sound a little dumb, but it is the essence of the whole mathematical tower to be constructed here. If an apple falls to the ground, we must express that phenomenon using a universal law, valid in any reference system. Such laws are generally expressed by tensor equations. Tensors are geometrical objects, independent of coordinates, so that when you say that a tensor gives zero (that is a tensor equation, right?) at some frame ($T = 0$), that must be true in any other frame ($T' = 0$): they have the same form. (You should not come up with, say, $T' \neq 0$…) Such equations with the same form in all equivalent reference systems are an expression of the universality of the law they represent. That’s the principle of general covariance. Specific numerical values of physical quantities may be different in different frames, but the tensor equation has the same form.
For that reason, an appropriate relation between frames must provide an expression that connects exactly the measured quantities related to the given phenomenon, since they do differ from coordinate frame to coordinate frame. That includes the matrix representation of tensors.
But do not suppose from that statement that such relations between frames are only relevant when we make specific computations!
For instance, in special relativity, quantities measured in one inertial frame are connected to quantities measured in another inertial frame by the Lorentz transformation. That transformation can be regarded as a matrix as well. But a matrix is just a way to see it. What it does express is a fundamental symmetry of space-time (manifested essentially by the constancy of the velocity of light as measured by different observers). It forms a group (and this is a huge and fundamental subject that we will be able to address at some point, hopefully!). None of that needs a coordinate system to have a meaning.
As we proceed, we will be able to return to what has been stated here, in much more detail. For now, in summary, tensors can be represented by matrices, which are a useful concept in calculations. But we do not want to see them only as a set of numbers, as we could easily lose sight of the forest for the trees.
Functions, Functionals and Tensors
Functions.
A function is a very basic concept in mathematics, but being basic doesn’t mean it is not a source of confusion. The best place to learn what a function is, at least the best that I know of, is [Ten85]. That book is not about functions per se, but one on ordinary differential equations that starts from the very beginning and develops into a huge, complete book on the matter. If you do not have access to that book (although it is cheap and I fully recommend it, because you will always need a good book on differential equations anyway), take a look for the moment at the Wikipedia article on functions to refresh your memory or knowledge about them.
We will need the notion of functions because we will need the notion of a functional. And we will need the notion of a functional because (if you have been following our previous posts) we have been investing our time and energy to first clear up the notion of a tensor as much as possible. (And why is that? Well, because cleaning up concepts is one of the philosophies that drives this blog!)
Functionals.
You may start thinking of a functional as a function of a function. That is, a function that takes as its argument a function and results in a number (a scalar). A good physics example can be found in the variational principle formulation of classical mechanics of point particles: the action functional, $S[q] = \int_{t_1}^{t_2} L(q, \dot{q}, t)\,dt$; that is, it takes as input a function of time, the generalized coordinate of a particle of the system at a given time, $q(t)$.
The function that, amongst all possible ones, makes the action stationary (a minimum, in the simplest cases) gives the actual trajectory of the particle between two instants of time.
One interesting point here is that the argument of a functional, that is, a given function, can also be considered itself a vector, if that function is taken from a function space that happens to comply with the properties of a vector space (recall the axioms that define vector spaces). So a functional can be a function of a vector. Or of a set of vectors.
Let us further talk about a particular kind of functional, a linear functional, that is, one which satisfies the usual linearity requirement:
$f(\alpha \vec{u} + \beta \vec{v}) = \alpha f(\vec{u}) + \beta f(\vec{v})$,
where $\alpha$ and $\beta$ are arbitrary real numbers, and $\vec{u}$ and $\vec{v}$ are arbitrary vectors. That is, it doesn’t matter whether you first multiply vectors by numbers and sum them, and only then evaluate the functional of that result, or evaluate first the functional of each vector and then multiply each functional evaluation by the numbers. The result is the same, a real number or scalar.
As an example, take the following linear functional:
$f(\vec{v}) = \vec{A} \cdot \vec{v}$,
where $\vec{A}$ is a fixed vector and the operation above is the scalar product of vectors. You can immediately see that $f$ is a functional and that it is linear.
Now a very important point that we want to make is summarized in the representation theorem for linear functionals (see our Theorem page at the side bar), which states that if $f$ is a linear functional, then there is a unique vector $\vec{A}$ such that $f(\vec{v}) = \vec{A} \cdot \vec{v}$ for all $\vec{v}$. All linear functionals can be precisely put into this form!
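In a finite-dimensional Euclidean space with an orthonormal basis, the representing vector can be found very concretely: its components are the values of the functional on the basis vectors. The sketch below illustrates that; the helper name `representing_vector` and the particular functional `f` are hypothetical choices of ours.

```python
import numpy as np

def representing_vector(f, dim=3):
    """For a linear functional f on a Euclidean space with an orthonormal basis,
    return the vector whose dot product with any input reproduces f.
    Its i-th component is simply f evaluated on the i-th basis vector."""
    return np.array([f(e) for e in np.eye(dim)])

def f(v):
    # A hypothetical linear functional, for illustration only
    return 2.0 * v[0] - v[1] + 0.5 * v[2]

A = representing_vector(f)

v = np.array([1.5, -4.0, 2.0])
print(A)             # the representing vector
print(f(v), A @ v)   # the same number
```

Uniqueness is easy to believe from this construction: two vectors that give the same dot product with every basis vector must have the same components.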
What does it mean? It states that any linear functional of a certain vector can be written as the dot product of that vector with another fixed, unique vector, which can be found appropriately. But it means much more! You can of course generalize that to as many vector arguments as you want, and you will get a multilinear functional. That gives some spin-offs, and one of them certainly regards tensors.
Tensors.
So now, let us bring into the discussion the notion of a tensor, from the point of view of “simple tensors” (we will generalize that further in future posts). They are simply defined as linear vector-valued functions of one or more vectors (that is, being linear in their arguments). That is, functions that map vectors to a vector in a linear way.
Functionals and Tensors.
So we have at this point a pretty abstract definition of a tensor (a linear vector-valued function of one or more vectors) on one side and a multilinear functional (a scalar-valued function of functions, which in turn can be vectors) on the other side, and you may be asking: what is the connection between them?
Well, it is in fact simple to see the connection if you have really understood the implications of the representation theorem for linear functionals and have a good idea on how to apply them to get pretty interesting results. The clue is: through that theorem you can construct linear vector-valued functions of vectors (i.e., tensors) directly from multilinear functionals and vice-versa, and in fact generalize the notion of tensors, making a unified picture in which a tensor is a multilinear functional itself.
To see that, start with a bilinear functional of two vectors, $T(\vec{u}, \vec{v})$. Hold, say, $\vec{v}$ fixed for a moment. Then under that condition it is clear that $T(\vec{u}, \vec{v})$ defines a linear functional on the variable $\vec{u}$, right? So if you apply the representation theorem for linear functionals to $T(\vec{u}, \vec{v})$, when $\vec{v}$ is held fixed, you may write that:
$T(\vec{u}, \vec{v}) = \vec{u} \cdot \vec{A}$.
Notice, however, that since we have only temporarily fixed $\vec{v}$, $\vec{A}$ depends linearly on $\vec{v}$. Then what we are saying is that $\vec{A}$ defines a new linear vector-valued function of a vector, which we will call $T(\vec{v})$:
$\vec{A} = T(\vec{v})$.
But if that is a linear vector-valued function of a vector, then it defines a tensor.
So, you see, that game is also true if you had started from the tensor $T(\vec{v})$ and wanted to get the bilinear functional $T(\vec{u}, \vec{v}) = \vec{u} \cdot T(\vec{v})$. You can also generalize that to multilinear functionals as well as vector-valued functions of several variables in one direction or another.
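The back-and-forth described above can be imitated numerically. In this sketch, the functional `bilinear` is a hypothetical example of ours. Holding the second argument fixed and applying the recipe of the representation theorem (evaluate on each orthonormal basis vector) gives the associated vector-valued function, and dotting its output with the first argument recovers the original functional.

```python
import numpy as np

def bilinear(u, v):
    # A hypothetical bilinear functional of two vectors, for illustration only
    return u[0] * v[1] - 2.0 * u[1] * v[0] + 3.0 * u[2] * v[2]

def vector_valued(v):
    """The vector-valued function associated with `bilinear`: with v held fixed,
    its i-th component is the functional evaluated on the i-th basis vector and v."""
    return np.array([bilinear(e, v) for e in np.eye(3)])

u = np.array([1.0, 2.0, -1.0])
v = np.array([0.5, -1.0, 4.0])
w = np.array([2.0, 0.0, 1.0])

print(bilinear(u, v), u @ vector_valued(v))                       # same number
print(vector_valued(2.0 * v + w), 2.0 * vector_valued(v) + vector_valued(w))  # linear in v
```

The second line of output shows that the constructed function is linear in its argument, which is what makes it a tensor in the strict sense used here.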
There is then a close association between tensors (strictly defined as linear vector-valued functions of vectors) and multilinear functionals, back and forth, through the representation theorem for linear functionals. It is very natural that we want to extend the notion of a tensor to include multilinear functionals.
The rank of a tensor is the number of its vector arguments plus the rank of its output. So if you have a tensor that takes three vectors and gives as output one vector, then it is a tensor of rank 4.
Then, in that previous example, the tensor $T(\vec{v})$ is a linear function that takes as its argument 1 vector and delivers 1 vector, the so-called tensor of rank 2: it maps a vector to a vector (can you imagine a map that rotates a vector through an angle?). A tensor $T(\vec{u}, \vec{v})$ is also of rank 2, that is, it maps 2 vectors to a scalar (that scalar “counts” as a zero); recall the metric function $d$? Notice that a linear scalar function (one that takes as argument a scalar and delivers a scalar) is a tensor of rank 0.
You can play around with tensors so redefined, like, e.g., a tensor that gets as argument, say, X vectors and delivers, say, Y vectors.
Generalizations.
As a final word, recall that we are talking about Euclidean tensors. We will generalize that notion to tensors living on more general spaces later. Then it will happen that a tensor will take as arguments not only vectors, but also other geometrical objects (one-forms), so that the inputs and outputs result in completely “mixed” tensors, with input geometrical objects living in different spaces. More tensor analysis on manifolds later.
So for the moment, a scalar is a tensor of rank 0. A vector is a tensor of rank 1. A tensor can be defined through a multilinear functional.
Ref.: [Nea10]

