<body>

Node Structure and Types

Thursday, September 19, 2013
Nodes and Tags - The data in the FNA keys can be modeled using nodes, each of which can have
  • Parents and children
  • Tags that store values associated with the nodes
Node Structure - The following shows the structure of the data within each node:
The last three items in a node (the NodeEntry in the nodeList) are pointers to lists that contain a subset of other nodes. The parentList allows tracing each path back to the root node for a family. The childList allows tracing each path through each couplet in the key to all targets in the key; see A Key as a Hierarchy.  The targetList also allow tracing each path to all nodes, but not through couplets; it also allows tracing to nodes for the rare cases where the nodes are only used for naming (see Naming Hierarchy) or nodes are not in any key, but are on the taxa list for a parent taxon.

[Add new node types: Naming Only - Used to provide names when the keying hierarchy does not follow the naming hierarchy. In most cases, taxa that are naming only are singleton parents, but they can be a key base node, such as Askellia in the Flora of the Pacific Northwest (see Subkeys and Concatenated Keys).]

[need consistent usage of "node type" and "taxon type". Taxon Type is called Taxon Id Type in the table below, and is found by findNodeType.sh (node type was my old name for taxon type). The list of Node Types is below.]

[Modify to add Target Subset Index (pointer to a class subset for all keys that has the key's target list; these are only used in the key's base node or, if there are alternate keys, in the alternate keys' base node, which are shown in the Node Groups & Relationships figures), and a tag for taxa that are base classes for keys showing special types like targetsInKey (see 5/27/2015).] [Target Subset Index (tsi) creates a "naming hierarchy", which is usually the same as the "keying hierarchy".]

Node Types - The node type is specified in the tagListType field of each ClassEntry.  The node type descriptions below need to be read in conjunction with understanding how nodes are related, which is described in Node Groups & Relationships and in the Dual Role of Nodes section below.
  • Root - This is the base for each family, which contains the family name.  Most are base nodes for each family key, but a few families have only one species, so no key is involved and the root is a singleton parent.  Besides the family name, the node contains the unique taxon id assigned by FNA; also the FNA numbers each family (currently from 1 to 128), so this number is also contained in the root node (as the Target # in the table below).  Each family root node is level 1 in the taxon hierarchy. If a root node is the base node for a family key, the children are row 1 and 2 in that key, so the node has the dual role of being the first couplet in the key.  This is not the case if there are alternate keys for the family; instead the root node has the base node for each of these alternate keys as its children, and it has the dual role of offering a choice of the names of the alternate keys to the user.
  • Couplet - Couplet nodes document the decision points in the keys (see A Key as a Hierarchy).  The node has pointers to each of the two choices of attributes, but those attributes themselves are not contained in this node.  However, in this node is the attribute(s) for the choice that led to this couplet. For each couplet choice an attribute shows what locations (see Taxon Locations) taxa with paths through this choice are found; this also applies to terminal taxa. The level of this node is the same as that of the base node for this key.  As mentioned in Key Types and Subkeys, some keys have subkeys, each of which is given a name with a number or letter or with an intermediate taxon rank; for subkey couplet nodes, this name is stored with the couplet node (other types of couplet nodes do not have a name). An intermediate rank can also be associated with a choice that points to another couplet rather than to a subkey. Besides pointing to another couplet node, a couplet node can point to two types of target nodes, which are described next.
  • Target - This is the destination taxon, which was arrived at by a unique sequence of choices in the key; that is, there is a single attribute set in the key that has this taxon as its target.  This taxon will be at the next level compared to the level of the base node for this key.  In this node is the final attribute choice that led to this target.  Also in this node is the taxon name and number relative to the taxon in the key's base node; and there is a unique taxon id assigned by FNA. If a target node is the base node for a key, the node's children are row 1 and 2 in that key, so the node has the dual role of being the first couplet in the key. This is not the case if there are alternate keys at this level; instead the target node has the base node for each of these alternate keys as its children, and it has the dual role of offering a choice of the names of the alternate keys to the user. There is a rather special case where the target is an intermediate taxon that only has one child, so the intermediate name has to be stored with this type of target node as well as the taxon name of the child (this should not be called a subkey singleton because it is not a key and is not a singleton node, as used below). [To Do: Either (1) target nodes are one level above base & intermediate nodes only associated with couplets or (2) add additional nodes that show all immediate children of intermediate nodes, so can show hierarchy.]
  • Attribute Set - This node corresponds to one attribute set for a destination taxon that has multiple attribute sets (an attribute set is a sequence through the key leading to a target taxon; multiple-attribute-set targets were described in A Key as a Hierarchy). This node is similar to a target node, but since there are multiple paths to this destination taxon, there is also a special connector node that each of these attribute set nodes point to. Appended to the taxon name is an attribute set number in order to make the node name unique. There is no relative target number and taxon id since these are in the connector node.  Note that there can also be multiple attribute sets leading to subkeys, so in this case instead of a taxon name, the name is an intermediate taxon name.
  • Multiple-Attribute-Set Connector - This node is the destination taxon node pointed to by each of the multiple-attribute-set nodes; this connector node points back to each of those nodes (so to follow a path to the base node requires knowing which of the attributes sets was selected). For an example of a connector node see Effect of Single and Multiple Parents on Next-Level Keys in A Key as a Hierarchy. Like a target node, this node has the taxon name, relative target number and taxon id.  In addition, this node has the merge point for all of the attribute set paths; that is, all attribute sets have common attributes between the merge point node and the base node for the key; in some cases, the merge point and the base node are the same (there are no common attributes).  If the attribute set nodes are for an intermediate taxon, this connector node will be for that intermediate taxon also.  [Probably want to split this into Multiple-Attribute-Set Target Connector and Multiple-Attribute-Set Group Connector (or External and Internal Multiple-Attribute-Set Connectors) because in the table below Multiple-Attribute-Set Group Connector does not need Taxon Id and Target #.]
  • Alternate-Key Connector -  This node is needed whenever there are alternate keys (in addition to the alternate key node, which is discussed next).  In this case, a connector node is needed for each target in the alternate keys, so that as couplet choices are made that traverse down through any target, which of the alternate keys was used can be saved and when traversing back up through that target, the same alternate key is used in creating the attribute list for the terminal target. If there are multiple-attribute-set nodes in an alternate key, a multiple-attribute-set connector node is needed for each set, and each of those connector nodes is connected by an alternate-key connector node.
  • Alternate Key - An alternate key node connects a root or target node to all available alternate keys for the family or target. Currently only eleven alternate key nodes are needed, which are listed in Key Types and Subkeys. Each alternate key node is the base nodes for the corresponding key. Each is named so that the parent can offer a choice to the user which of the alternate keys they want to use.
  • Singleton - This node represents a singleton node, which results when a parent node only has one child that is at the next level, so no key is involved.  A root node, a target node or a connector node can be a singleton parent node.  A singleton node contains the taxon name, relative target number and taxon id.
  • Missing-Key Target - If a taxon has children, but the key to distinguish the children from each other is missing, then they can still be listed under that taxon.  For those that are not terminal, then lower keys may exist, so that a hierarchy can be shown with a gap for the missing key. A missing-key target is also used when a taxon is keyed at a level that is not normal for that taxon; e.g., there is no need for a subsp. key for Piperia elegans because all these subspecies are in the key for species at the Piperia level, but a missing-key target node is needed for Piperia elegans to show the hierarchy of names from Piperia to elegans and then to its two subspecies (see Naming Hierarchy) [so a better name for this node type may be "Target Not-In-A-Key"].
  • Segregate - A target in a key that is where a taxon used to exist before FNA moved it to a new location in the hierarchy. A link gives that new location, so that keying of a specimen can be resumed from there.
[In the table below "Multiple-Path" is used instead of "Multiple-Attribute-Set" - which is better?]
The following summarized the data contained in each type of node:

Node Type Name Rank Locations Row Label Characters Taxon Id Taxon Id Type Target # Merge Point Intermediate Name
Root





Couplet



Target
Multiple-Path Target




Multiple-Path-Target Connector



Multiple-Path-Group Connector





Alternate-Key Connector





Alternate Key







Singleton





Missing-Key Target





Segregate








See Node Groups & Relationships for a description of how nodes are combined to model the FNA keys.

[Add 11/4/2017 diagrams of the major node types: couplet, target, AS, connector.]

Dual Role of Nodes - The node type reflects the role of a node relative to its parent or to other nodes in a key.  Root, target, connector and singleton nodes also have another role relative to nodes at the next level; these roles are either
  • Couplet
  • Singleton Parent
  • Key Base
  • Offer alternate key choices
  • Terminal
These are not separate nodes, but nodes at the previous level acting in their other role.  In A Key as a Hierarchy, an example shows a row in a key can have dual roles in that key and the key at the next higher level. In Node Groups & Relationships the more general term Base Node is used for Singleton Parent or Key Base.

Tag List Creation - The following are general steps in creation of a tag list and tagListIndex that is used in a class entry:

  1. Create a new Tag object.
  2. Add that object to the tags array list, keeping track of the tag index.
  3. Create a new TagList object, using the tag index. keeping track of the tag list index.
  4. Get a reference to that TagList object.
  5. For each addition tag needed, create a new Tag object.
  6. Add that object to the tags array list, keeping track of the tag index.
  7. Add that tag index to the TagList object.
  8. Insert the tag list index obtained in step 3. in the class entry.


0 Comments:

Post a Comment

<< Home