Importing data into Data Explorer is the first step in creating a visualization of that data. Data Explorer can import data in several formats: General Array, Data Explorer native, CDF, netCDF, and HDF (see Appendix B, "Importing Data: File Formats," in IBM Visualization Data Explorer User's Guide). The General Array Importer is discussed here not only because it can import a variety of data types but also because its supporting interface makes it useful to the broadest range of users. This interface consists of the Data Prompter, for describing the data to be imported, and the Data Browser, for viewing the data.
This chapter deals with the importation of data in the following sections:
|An Important Note on Fields|
Importing data into Data Explorer requires some knowledge of the Data Explorer data model and at least a working knowledge of a field.
Fields are the fundamental objects in the Data Explorer data model. A field represents a mapping from some domain to some data space. The domain of the mapping is specified by a set of positions and (generally) a set of connections that allow interpolation of data values for points between positions. Positions represent what can be thought of as (and often really are) locations in space; the data are the values associated with the space of the positions. The mapping at all points in a domain (not just those specified by the given positions) is represented implicitly by specifying that the data are dependent on (located at) the sample points or on the connections between points.
This simple abstraction is sufficient for representing a wide range of information. For example, you can describe 3-dimensional volumetric data whose domain is the region specified by positions and whose data space is the set of values associated with those positions. The domain of a 2-dimensional image on a monitor screen is a set of pixel locations, and the data space consists of the pixel colors. For 2-dimensional surfaces embedded in 3-dimensional space (e.g., traditional graphical models), the domain may be a set of positions on the surface, and the data space a set of data values on that surface.
In Data Explorer the positions and data are said to be components of a field, and every field must contain at least a "positions" component and a "data" component. Fields may also contain other components (e.g., "connections"). Thus a Data Explorer field consists of data and the additional components needed to describe that data so that Data Explorer can process it.
Components are represented as arrays of numbers with some auxiliary information specifying attributes (e.g., type of data dependency). The syntax for defining fields in the General Array format is described in 5.3, "Header File Syntax: Keyword Statements". The various components are described in IBM Visualization Data Explorer User's Guide.
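As a rough illustration of the field abstraction described above, the following C sketch models a field as a set of named components, each an array of numbers plus a little descriptive information. All type and function names here (struct component, struct field, field_is_consistent) are hypothetical and do not reflect Data Explorer's internal representation. The check encodes only the idea that position-dependent data supply one value per position, while connection-dependent data supply one value per connection element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a field; not Data Explorer's actual data layout. */
enum dependency { DEP_POSITIONS, DEP_CONNECTIONS };

struct component {
    const char   *name;    /* e.g. "positions", "data", "connections" */
    size_t        count;   /* number of items in the array */
    size_t        width;   /* numbers per item (2 for an x,y position) */
    const double *values;  /* count * width numbers, stored as doubles for simplicity */
};

struct field {
    struct component positions;
    struct component data;
    struct component connections;  /* count == 0 for scattered data */
    enum dependency  data_dep;     /* what the data values are attached to */
};

/* Returns 1 when the number of data items matches the component
 * the data are declared to depend on, and 0 otherwise. */
int field_is_consistent(const struct field *f)
{
    size_t expected;

    assert(f != NULL);
    expected = (f->data_dep == DEP_POSITIONS) ? f->positions.count
                                              : f->connections.count;
    return f->positions.count > 0 && f->data.count == expected;
}
```

For a single quad with four corner positions, a position-dependent field would carry four data values and a connection-dependent field would carry one.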
To import data through the General Array Importer, you must be able to answer the following questions.
|Components and Variables|
In Data Explorer terminology, the values of the independent variable constitute the "positions" component of a data field. In the examples above, the first independent variable consists of locations in space and the second does not, but both would be represented as "positions" in a data field. The independent variable is always represented by the "positions" component.
The values of the dependent variable constitute the "data" component.
Figure 11. Examples of Grid Types. The three grids in the top row represent surfaces; those in the bottom row, volumes. Reading from left to right, the three types of grid are: irregular (irregular positions, irregular connections), deformed regular (irregular positions, regular connections), and regular (regular positions, regular connections).
Figure 12. Examples of Data Dependency. In the visualization on the left, data correspond one-to-one with positions. Other data values (and colors) are interpolated linearly between positions. In the visualization on the right, the elements connecting positions are quads. Data (and colors) correspond one-to-one with, and are constant within, each quad.
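The two kinds of data dependency can be sketched in one dimension, where the connections are line segments rather than quads. This is a minimal illustration, not Data Explorer code: the function names are hypothetical, positions are assumed to be sorted in increasing order, and queries outside the domain are simply clamped to the nearest end.

```c
#include <assert.h>
#include <stddef.h>

/* Position-dependent data: n values at n positions.
 * Values between positions are interpolated linearly. */
double sample_dep_positions(const double *x, const double *v, size_t n, double q)
{
    size_t i;

    assert(n >= 2);
    for (i = 0; i + 1 < n; i++) {
        if (q >= x[i] && q <= x[i + 1]) {
            double t = (q - x[i]) / (x[i + 1] - x[i]);
            return (1.0 - t) * v[i] + t * v[i + 1];
        }
    }
    return (q < x[0]) ? v[0] : v[n - 1];   /* clamp outside the domain */
}

/* Connection-dependent data: n - 1 values, one per line segment.
 * The value is constant everywhere inside its segment. */
double sample_dep_connections(const double *x, const double *v, size_t n, double q)
{
    size_t i;

    assert(n >= 2);
    for (i = 0; i + 1 < n; i++) {
        if (q >= x[i] && q <= x[i + 1])
            return v[i];
    }
    return (q < x[0]) ? v[0] : v[n - 2];   /* clamp outside the domain */
}
```

The same three positions therefore need three data values in the first case and two in the second.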
Note: The General Array Importer supports only regular connections (lines, quads, and cubes) or scattered data. For irregular connections such as triangles or tetrahedra, you can use the Data Explorer native format to import your data. (See IBM Visualization Data Explorer User's Guide.)
With the answers to these questions, you can now use the General Array Importer to describe your data.
Figure 13. Row- versus Column-Majority Grids. The two grids shown here are generated from the same data file, consisting simply of the numbers 1, 2, 3, ..., 20. The associated header files differ only in the specification of the grids' majority.
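The effect of majority on a flat data file can be sketched as an index calculation. The example assumes a 4 x 5 grid, which is a hypothetical choice, and assumes the usual convention that row majority means the last index varies fastest and column majority means the first index varies fastest; check this against 5.3, "Header File Syntax: Keyword Statements". The function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Offset into the flat data file of grid point (i, j) on an ni x nj grid
 * when the last index (j) varies fastest. */
size_t offset_row_major(size_t i, size_t j, size_t ni, size_t nj)
{
    assert(i < ni && j < nj);
    return i * nj + j;
}

/* Offset of the same grid point when the first index (i) varies fastest. */
size_t offset_column_major(size_t i, size_t j, size_t ni, size_t nj)
{
    assert(i < ni && j < nj);
    return j * ni + i;
}
```

With a file holding 1, 2, 3, ..., 20 (so the value at offset k is k + 1), grid point (1, 0) receives the value 6 under row majority and the value 2 under column majority, while the first and last points receive 1 and 20 in both cases.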
The General Array Importer uses a "header file" to describe the structure and location of data to be imported. This file consists of keyword statements that identify important characteristics of that data (including grid structure, format, and data type, along with the path name of the file containing the data).
A header file can be created with a text editor or, more easily, with the Data Prompter, which prompts for the necessary information. (See 3.3, "Importing Data" for an example that uses the Data Prompter and 5.4, "Data Prompter" for a detailed description of how to use it.) The Data Prompter also checks for incorrect syntax, such as conflicting keywords (see 5.3, "Header File Syntax: Keyword Statements").
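For readers who generate header files from a program rather than with the Data Prompter, the following C sketch formats a small header into a buffer. It is an assumption-laden illustration: the keyword spellings (file, grid, format, majority, field, structure, type, end) are written from memory of the General Array format and should be verified against 5.3, "Header File Syntax: Keyword Statements". The function name format_header and the field name "temperature" are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats a minimal header describing an nx x ny grid of ASCII scalar
 * floats stored in data_path. Returns the snprintf result: the number of
 * characters the full header needs, excluding the terminating '\0'.
 * Keyword spellings are assumptions; confirm them against section 5.3. */
int format_header(char *buf, size_t size, const char *data_path,
                  int nx, int ny, const char *majority)
{
    assert(buf != NULL && data_path != NULL && majority != NULL);
    assert(strcmp(majority, "row") == 0 || strcmp(majority, "column") == 0);

    return snprintf(buf, size,
                    "file = %s\n"
                    "grid = %d x %d\n"
                    "format = ascii\n"
                    "majority = %s\n"
                    "field = temperature\n"   /* hypothetical field name */
                    "structure = scalar\n"
                    "type = float\n"
                    "end\n",
                    data_path, nx, ny, majority);
}
```

The caller should compare the return value with the buffer size to detect truncation before writing the text to a header file.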
Once a header file has been created, the data it describes can be imported into Data Explorer by the Import module. To identify a header file to Data Explorer through the Import dialog box:
The General Array Importer imports ASCII or binary data that is organized in one of two general "styles": block or columnar. Block style requires that the data be organized in records, or blocks. Columnar style requires that the data be organized in vertical columns (see Figure 14).
Figure 14. Block and
Columnar Styles of Data Organization. The three horizontal data blocks
at left illustrate the block style; the three vertical columns at right, the
columnar style. A, B, and C represent separate variables.
The following set of FORTRAN I/O statements generates a block-style (record-style) data file:

      write(15,20) (A(i),i=1,100)
      write(15,20) (B(i),i=1,100)
      write(15,20) (C(i),i=1,100)
 20   format(10(f10.3))

An equivalent example in C is:

    for (i=0; i<100; i++) printf("%10.3f",A[i]);
    for (i=0; i<100; i++) printf("%10.3f",B[i]);
    for (i=0; i<100; i++) printf("%10.3f",C[i]);
The following FORTRAN I/O statement produces a columnar-style data file:
      write(15,10) (A(i),B(i),C(i),i=1,100)
 10   format(3(2x,f10.3))

An equivalent example in C is:

    for (i=0; i<100; i++)
        printf(" %10.3f %10.3f %10.3f\n",A[i],B[i],C[i]);
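To make the difference between the two styles concrete from the reader's side, the following C sketch parses the same three variables from text laid out in each style. It is only an illustration, not how the General Array Importer works: the function names are hypothetical, and the values are assumed to be separated by whitespace (fixed-width output with no separating blanks would need a different parser).

```c
#include <assert.h>
#include <stdio.h>

/* Columnar style: each line holds one value of A, B, and C.
 * Reads at most max rows and returns the number of rows read. */
size_t parse_columnar(const char *text, double *a, double *b, double *c,
                      size_t max)
{
    size_t n = 0;
    int used = 0;

    assert(text != NULL);
    while (n < max &&
           sscanf(text, "%lf %lf %lf%n", &a[n], &b[n], &c[n], &used) == 3) {
        text += used;   /* advance past the row just consumed */
        n++;
    }
    return n;
}

/* Block style: all count values of A, then all of B, then all of C.
 * Returns 1 on success and 0 if the text runs out early. */
int parse_block(const char *text, size_t count, double *a, double *b, double *c)
{
    double *vars[3] = { a, b, c };
    int used = 0;
    int k;
    size_t i;

    assert(text != NULL);
    for (k = 0; k < 3; k++) {
        for (i = 0; i < count; i++) {
            if (sscanf(text, "%lf%n", &vars[k][i], &used) != 1)
                return 0;
            text += used;
        }
    }
    return 1;
}
```

Both functions fill the same arrays; only the order in which the numbers appear in the file differs.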
For both the block and columnar styles, the information in the file can be positions as well as data. The data can be: