OpenDX - Documentation

CategoryStatistics

Category

Transformation

Function

Calculate statistics on data associated with a categorical component

Syntax

statistics = CategoryStatistics(input, operation, category, data, lookup);

Inputs
Name       Type                         Default            Description
input      field                        (none)             field for which to compute statistics
operation  string                       "count"            operation to perform ("count", "mean", "sd", "var", "min", "max")
category   string                       "data"             component with categorical values
data       string                       "data"             data component for statistics
lookup     integer, string, value list  "category lookup"  lookup component

Outputs
Name        Type   Description
statistics  field  field with data containing the statistics and positions for the category values

Functional Details

input

field containing the categorical and data components

operation

calculation to perform: one of "count", "mean", "sd", "var", "min", or "max". The default is "count".

category

component with categorical values. This component must have an integer type (int, ubyte, ...).

data

data component for statistics. This component must be scalar.

lookup

lookup component (optional)

CategoryStatistics calculates statistics on a scalar component associated with a categorical component. If the operation is "count", the data component is ignored and the number of entries in each category is counted, which yields a histogram of the unique values in the categorical component.
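The "count" behavior can be illustrated with a short Python sketch. This is not OpenDX code: the function name category_counts and the n_categories parameter are hypothetical, and falling back to "largest index plus one" when no category count is given is an assumption made only for this illustration.

```python
from collections import Counter


def category_counts(categories, n_categories=None):
    """Count how many entries fall into each category index 0..N-1."""
    if n_categories is None:
        # Assumption for this sketch: with no other information,
        # N is taken to be the largest category index plus one.
        n_categories = max(categories) + 1 if categories else 0
    tally = Counter(categories)
    # Categories that never occur get a count of zero.
    return [tally.get(i, 0) for i in range(n_categories)]


counts = category_counts([1, 0, 1, 2, 3])
print(counts)  # [1, 2, 1, 1]
```

No data values are passed in, which mirrors the statement that the data component is ignored for "count".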

For example, suppose input is a field with a "state" component containing the entries {1,0,1,2,3}, a "state lookup" component containing the entries {"CA", "NY", "PA", "VA"}, and a "sales" component containing the entries {1.2,1.0,1.4,1.7,1.8}. Then CategoryStatistics(input,"mean","state","sales") produces an output field whose "positions" component contains the indices {0,1,2,3} and whose "data" component contains the mean sales value for each state, that is, {1.0,1.3,1.7,1.8}.
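The arithmetic in this example can be checked with a minimal Python sketch. It is not the module's implementation: the function name category_statistics and its return shape are hypothetical, N is assumed to be the largest category index plus one, and "var" and "sd" are assumed to use the population form (dividing by the number of entries), since the text does not say which form the module uses.

```python
import math
from collections import defaultdict


def category_statistics(categories, data, operation="count"):
    """Return (positions, statistics) for each category index 0..N-1."""
    groups = defaultdict(list)
    for cat, value in zip(categories, data):
        groups[cat].append(value)
    # Assumption: N is the largest category index plus one.
    n = max(groups) + 1 if groups else 0

    def reduce(values):
        if operation == "count":
            return len(values)
        if not values:
            return None  # assumption: an empty category has no statistic
        if operation == "min":
            return min(values)
        if operation == "max":
            return max(values)
        mean = sum(values) / len(values)
        if operation == "mean":
            return mean
        # Assumption: population variance (divide by the entry count).
        var = sum((v - mean) ** 2 for v in values) / len(values)
        if operation == "var":
            return var
        if operation == "sd":
            return math.sqrt(var)
        raise ValueError(f"unknown operation: {operation!r}")

    positions = list(range(n))
    return positions, [reduce(groups.get(i, [])) for i in positions]


state = [1, 0, 1, 2, 3]
sales = [1.2, 1.0, 1.4, 1.7, 1.8]
positions, means = category_statistics(state, sales, "mean")
print(positions)                     # [0, 1, 2, 3]
print([round(m, 6) for m in means])  # [1.0, 1.3, 1.7, 1.8]
```

Category 1 occurs twice (sales 1.2 and 1.4), so its mean is 1.3; every other category has a single entry and keeps its own value.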

The output of CategoryStatistics is a field with a "positions" component corresponding to the categorical indices, and a "data" component corresponding to the requested statistics. The "positions" component will consist of the integers 0 to N-1, where N can be determined in a number of ways:

Components

Creates an output field with a "positions" component representing the categorical indices, and a "data" component containing the requested statistics. Creates a "categoryname lookup" component if a lookup table is specified using the lookup parameter.
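As a rough picture of the resulting components, the following Python sketch models the output field as a plain dictionary. The dictionary layout and the helper name build_output are hypothetical stand-ins for a DX field; only the component names ("positions", "data", and a "categoryname lookup" component, here "state lookup") come from the description above.

```python
def build_output(positions, stats, category_name=None, lookup=None):
    """Model the output field as a dict of named components."""
    field = {"positions": list(positions), "data": list(stats)}
    if lookup is not None:
        # The lookup component is only present when a lookup table is given.
        field[f"{category_name} lookup"] = list(lookup)
    return field


out = build_output(
    [0, 1, 2, 3],
    [1.0, 1.3, 1.7, 1.8],
    category_name="state",
    lookup=["CA", "NY", "PA", "VA"],
)
# Pair each statistic with its category label through the lookup component.
labeled = dict(zip(out["state lookup"], out["data"]))
print(labeled)  # {'CA': 1.0, 'NY': 1.3, 'PA': 1.7, 'VA': 1.8}
```

Because the positions are the integers 0 to N-1, they index directly into the lookup component, which is what makes the label pairing a simple zip.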

Example Visual Programs

Duplicates.net
Zipcodes.net

See Also

Categorize, Statistics, Lookup

