Classifiers
This section provides definitions for the various classifiers used by the hm
software and shows how to save a classification to a comma-separated-value text
file. There are two kinds of classifiers: archaeological classifiers that
reflect observations and inferences made by an archaeologist familiar with the
stratigraphic situation represented by the sequence diagram; and graph
theoretical classifiers that take advantage of the DAG representation to deduce
structural relations that might prove useful to the archaeologist tasked with
interpreting the sequence diagram.
The hm software currently implements the following classifiers:
(show-classifiers)
units
levels
phases
periods
adjacent
distance
reachable
Archaeological Classifiers
The hm software currently provides three ways to integrate archaeological
observations about contexts into the sequence diagram, which it calls units,
phases, and periods. Some preliminary discussion is in order because
archaeologists define these terms more or less precisely and have achieved more
or less consensus on their definitions. Also, there are very many other terms
in use that attempt to capture the practice of classifying and grouping
archaeological contexts.
Steve Roskams discusses this situation on pages 257–258 of Excavation in the context of assigning “basic units to higher-order categories.”
This raises the question of how such groups are to be defined—what is the point where one stops and another begins? Should there be a single type of group or a whole hierarchy of entities? How should each be named? Or are the potential configurations so diverse that it is impossible to make useful comparisons between sites?
This issue has been approached by a variety of specialists who have produced different schemes and terminology. To some extent the differences are a function of the great variety of sites with which they are dealing, and are thus entirely legitimate. Equally, they come, in part, from merely giving a different name to the same concept … This profusion of terminology is to be expected, given that thoughts on such matters are at a very formative stage, and it is probably right to retain some diversity in any case, given the variety of sites and associated stratigraphic sequences, and types of analytical procedures applied to them.
The units classification offered by the hm software is more precisely
defined and widely appreciated by archaeologists than either periods or
phases. The hm software recognizes the basic distinction of depositional
from interfacial units of stratification and hard-wires this distinction into
the software, while leaving to the hm user the choice of how it might be
represented in the sequence diagram.
The hm software takes no such confident stand with periods and phases,
leaving their definitions (if not their names within the software) completely up
to the user. From the software’s point of view, there is no difference
between a period and a phase and there is no relationship between them. The
hm user is best served by viewing them as ways to incorporate any kind of
classification of contexts whatsoever. The classifications might be based on
statistical analysis of the finds collected from each context, presence or
absence of one or more index fossils, physical characteristics of the contexts
themselves, etc.
In a sense, the crucial question from the software’s point of view is how many
different kinds of classification might be useful on a single sequence diagram?
Are two kinds sufficient? The answer to this question does not involve judgments
about how many or few analyses are best carried out on excavation materials.
Rather, it involves how many classifications it might be useful to plot on the
same graphic. If there are a dozen classifications that need to be investigated,
then the way forward is to use the hm software to make 6–12 sequence diagrams,
each one showing one or two classifications, as appropriate.
Units
The units classifier is designed for use with excavation data, where both
depositional and interfacial units are identified and recorded. Archaeologists
universally identify and record depositional units (often as ‘layers’) but are
somewhat ambivalent about interfaces, typically choosing to record interfaces
that appear to be important and ignoring others.
This practice is potentially problematic for the hm software because it
assumes that interfaces that are not identified and recorded do not themselves
represent chunks of time that might be important in a chronological model. In an
ideal sense, the stratigraphic sequence represents a continuous record of time
from some point in the past represented by the base of excavation up through to
the modern surface. Neglecting to identify and record interfaces potentially
yields a discontinuous model of that continuous record, with gaps introduced by
the time spans represented by the excluded interfaces.
The default configuration file uses node shape to distinguish deposits from interfaces. Deposits are shaped like a rectangular box and interfaces are shaped like a trapezium, as in this snippet from a configuration file:
[Graphviz sequence unit node shape]
deposit = box
interface = trapezium
Individual deposits and interfaces are identified in the input file for contexts (see Section ).
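The configuration snippet above can be read by other software. The following Python sketch assumes that the file follows INI conventions that Python's configparser module accepts, which is not confirmed here for every hm configuration file. The section name and keys are copied from the snippet; the node_statement helper is hypothetical and only illustrates how the shape names might be passed to Graphviz.

```python
import configparser

# Section name and keys copied from the configuration snippet above.
SNIPPET = """
[Graphviz sequence unit node shape]
deposit = box
interface = trapezium
"""

parser = configparser.ConfigParser()
parser.read_string(SNIPPET)

# Map each kind of unit to its Graphviz node shape.
shapes = dict(parser["Graphviz sequence unit node shape"])


def node_statement(label, kind):
    """Return a Graphviz DOT node statement (hypothetical helper)."""
    return f'"{label}" [shape={shapes[kind]}];'


print(node_statement("1", "deposit"))
print(node_statement("2", "interface"))
```

Both box and trapezium are shape names that Graphviz recognizes, so the statements can be placed directly in a DOT file.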
Periods and Phases
The stratigraphic DAG for this example shows the Figure 12 section divided into four periods (fig. 1). The oldest period includes the Natural ground and its surface. The next oldest period includes Context 9 and its surfaces. Next is the once-whole deposit represented by Contexts 7 and 8, the surfaces of those contexts, and all of the various contexts related to construction of the wall stub, Context 5. The youngest period is represented by the Context 1 deposit and its surface.
On a Linux system, Figure 1 can be reproduced with the following function call:
(run-project/example :fig-12-periods)
Graph Theoretical Classifiers
The graph theoretical classifiers are defined in the context of a general definition of a directed graph. The definition used here is paraphrased from the book, Structural Models in Anthropology written by Per Hage and Frank Harary. This book is the first in a remarkable trilogy that also includes Exchange in Oceania: A Graph Theoretic Analysis and Island Networks: Communication, Kinship, and Classification Structures in Oceania.
One of the challenges presented by the literature on graph theory is the number
of synonyms for the basic elements of a graph (table 1), which this
document and the hm software try to label consistently as nodes and arcs. Hage
and Harary typically use point and arc for directed graphs, and point and line
for graphs. The definitions given below and in the following sections are quoted
verbatim, so it is up to the reader to translate synonyms appropriately.
node | arc |
---|---|
point | line |
vertex | edge |
junction | branch |
0-simplex | 1-simplex |
element | element |
Hage and Harary define a directed graph on page 68 of Structural Models in Anthropology as follows.
A directed graph or digraph D consists of a finite set V of points and a collection of ordered pairs of distinct points. Any such pair (u, v) is called an arc or directed line and will usually be denoted uv. The arc uv goes from u to v and is incident with u and v.
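The quoted definition translates almost directly into a data structure. The following Python sketch is an illustration only and is not how the hm software stores its graphs; the function name make_digraph and the point names are hypothetical.

```python
def make_digraph(points, arcs):
    """Return (points, arcs) after checking the quoted definition.

    points: a finite collection of hashable labels (the set V).
    arcs:   a collection of ordered pairs (u, v) of distinct points.
    """
    points = frozenset(points)
    arcs = frozenset(arcs)
    for u, v in arcs:
        if u not in points or v not in points:
            raise ValueError(f"arc {(u, v)} uses a point outside V")
        if u == v:
            raise ValueError(f"arc {(u, v)} does not join distinct points")
    return points, arcs


# The arc ("u", "v") goes from u to v; the reverse pair is a different arc.
V, A = make_digraph({"u", "v", "w"}, {("u", "v"), ("v", "w")})
print(("u", "v") in A, ("v", "u") in A)
```

Because the pairs are ordered, the presence of uv says nothing about vu, which is what lets the graph encode a direction in time.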
Adjacent
The adjacency classification is the simplest among the graph theoretical classifications. Following on the definition of directed graph, quoted above, Hage and Harary define adjacency as follows.
The arc uv goes from u to v and is incident with u and v. We also say that u is adjacent to v and v is adjacent from u.
It is important to note that the sequence diagram represents chronological adjacency, rather than physical adjacency. Two contexts that are in contact with one another can be said to be physically adjacent, but they might be chronologically separated by one or more other contexts.
This can be seen in the following example, which is based on the stratigraphic section shown on Figure .
The sequence diagram can be drawn to highlight the relationship of Context 2 to the others. When the sequence diagram is classified so nodes, node labels, and edges are colored according to adjacency from Context 2, such that the origin node is violet, adjacent nodes are cyan, and nodes not adjacent to Context 2 are orange, the graph in Figure 2 results.
The graph was produced on a Linux system with the following function call:
(run-project/example :roskams-h-solarized-dark)
The important sections in the initialization file include the following.
[Graph analysis configuration]
distance-from =
adjacent-from = 2
...
[Graphviz sequence classification]
node-fillcolor-by =
node-fontcolor-by = adjacent
node-shape-by =
node-color-by = adjacent
node-penwidth-by =
node-style-by =
node-polygon-distortion-by =
node-polygon-image-by =
node-polygon-orientation-by =
node-polygon-sides-by =
node-polygon-skew-by =
edge-color-by = adjacent
...
[Graphviz sequence graph attributes]
colorscheme = solarized
bgcolor = base02
fontname = Helvetica
fontsize = 14.0
fontcolor = base1
label = Roskams\' H-structure (corrected)
labelloc = t
style = filled
size =
ratio = 0.618034
...
[Graphviz sequence node attributes]
shape = box
colorscheme = solarized
style = filled
color = base01
fontsize = 14.0
fontsize-min = 6.0
fontsize-max = 22.0
fontcolor = base01
fillcolor = base03
...
[Graphviz sequence adjacent node colors]
origin = violet
adjacent = cyan
not-adjacent = orange
colorscheme = solarized
[Graphviz sequence adjacent node fontcolors]
origin = violet
adjacent = cyan
not-adjacent = orange
colorscheme = solarized
[Graphviz sequence adjacent edge colors]
origin = violet
adjacent = cyan
not-adjacent = orange
colorscheme = solarized
Note that Context 2 is in physical contact with Contexts 3 and 5. The sequence diagram (fig. 2) shows Context 2 adjacent to Context 3, but because the Context 2 pit cut into Context 3, which itself was cut into Context 5, Context 2 is not chronologically adjacent to Context 5.
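The difference between the two kinds of adjacency can be sketched in a few lines of Python. The arcs and contacts below are a simplified reading of the paragraph above, not the full sequence diagram, and the sketch assumes that each arc runs from the later context to the earlier one. All function names are hypothetical.

```python
# Arcs of the sequence diagram near Context 2 (assumed: later -> earlier).
arcs = {(2, 3), (3, 5)}

# Only the physical contacts stated in the text for Context 2.
contacts = {frozenset({2, 3}), frozenset({2, 5})}


def adjacent_to(u, arcs):
    """Points v such that u is adjacent to v (the arc uv exists)."""
    return {v for (a, v) in arcs if a == u}


def adjacent_from(v, arcs):
    """Points u such that v is adjacent from u (the arc uv exists)."""
    return {u for (u, b) in arcs if b == v}


def chronologically_adjacent(u, v, arcs):
    """True if an arc joins u and v in either direction."""
    return (u, v) in arcs or (v, u) in arcs


def physically_adjacent(u, v, contacts):
    """True if the two contexts are recorded as touching."""
    return frozenset({u, v}) in contacts


print(physically_adjacent(2, 5, contacts))   # contexts touch
print(chronologically_adjacent(2, 5, arcs))  # but no arc joins them
```

Keeping the two relations in separate collections makes it explicit that the sequence diagram records only the chronological one.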
Reachable
The graph theoretic definition of reachability requires definitions for directed walk and path. Hage and Harary define reachability this way on pages 68–69 of Structural Models in Anthropology, where they use digraph as a synonym for directed graph.
A (directed) walk in a digraph is an alternating sequence of points and arcs, v₀, x₁, v₁, …, xₙ, vₙ in which each arc xᵢ is vᵢ₋₁vᵢ. For brevity, we may write the sequence of points v₀, v₁, …, vₙ to indicate the same walk. The length of such a walk is n, the number of occurrences of arcs in it. A closed walk has the same first and last points, an open walk does not, and a spanning walk contains all the points. A path is a walk in which all points are distinct; a cycle is a nontrivial closed walk with all points distinct except the first and last. If there is a path from u to v, then v is said to be reachable from u …
Compare this definition with Roskams’ description of reachability as it is conceptualized by users of the Harris matrix on page 158 of the book, Excavation:
if one travels from a particular unit via the strands running up from it, all units through which one passes are provably later than that unit; if one travels down, every unit en route is provably earlier; any unit which cannot be reached in one of these two ways has no proven relationship with the unit in question.
This describes all the nodes from which the “particular unit” is reachable (“the strands running up from it”) and all the nodes reachable from “the particular unit” that are encountered when one “travels down”.
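The three groups in Roskams’ description can be computed with a simple graph search. The following Python sketch is an illustration and not the hm implementation. It follows the reading given above, in which arcs run from later units to earlier ones; the function names and the small example graph are hypothetical.

```python
def reachable_from(u, arcs):
    """Return every point v != u for which a path from u to v exists."""
    successors = {}
    for a, b in arcs:
        successors.setdefault(a, set()).add(b)
    seen, stack = set(), [u]
    while stack:
        point = stack.pop()
        for nxt in successors.get(point, ()):
            if nxt not in seen and nxt != u:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def reaching(v, arcs):
    """Return every point u != v from which v is reachable."""
    reversed_arcs = {(b, a) for a, b in arcs}
    return reachable_from(v, reversed_arcs)


# Hypothetical digraph; arcs assumed to run from later to earlier units.
arcs = {("a", "b"), ("b", "c"), ("a", "d"), ("e", "c")}
points = {p for arc in arcs for p in arc}

earlier = reachable_from("b", arcs)            # found by travelling down
later = reaching("b", arcs)                    # found by travelling up
unrelated = points - earlier - later - {"b"}   # no proven relationship
print(sorted(earlier), sorted(later), sorted(unrelated))
```

Reversing the arcs and reusing the same search is a convenient way to find the units from which a given unit is reachable.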
One key to dating Çatalhöyük has been identifying residual materials in the deposits. A good example of this is Context 1332+, where several seeds were recovered and dated. Initially, the seeds were thought to be associated with the Context 1332+ deposit, but four of them were subsequently determined to be residual. The effect of these different residuality determinations on the dating project at Çatalhöyük is discussed by Dye and Buck.
The importance of Context 1332+ in the stratigraphy of Çatalhöyük is indicated
by a graph classified by reachability (fig. 3).
This graph assigns dark blue fills to nodes from which Context 1332+ is
reachable and nodes reachable from Context 1332+. The node for Context 1332+ is
filled light blue. Nodes from which Context 1332+ is not reachable and nodes not
reachable from Context 1332+ are filled green. The colors are taken from the
Brewer color paired palette.
A stratigraphic DAG for the information displayed on Figure 3
might be drawn on a Linux system with the hm software as follows:
(run-project/example :catal-hoyuk-reachable)
The important sections of the configuration file include the following.
[Graph analysis configuration]
distance-from =
adjacent-from =
reachable-from = 1332+
reachable-limit =
[Graphviz sequence classification]
node-fillcolor-by = reachable
...
[Graphviz sequence reachability node fillcolors]
origin = 0
reachable = 1
not-reachable = 2
colorscheme = paired
Classification by reachability is potentially important for decisions on what materials to date. Contexts that are reachable from many contexts or from which many contexts are reachable might be thought of as having relatively great potential influence in a chronological model.
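One way to make this idea concrete is to count, for each context, how many other contexts are related to it by reachability in either direction. The following Python sketch is an assumption about how such a count might be made; the hm software is not known to compute it, and the function names and example graph are hypothetical.

```python
def closure(arcs):
    """Map each point to the set of points reachable from it."""
    points = {p for arc in arcs for p in arc}
    successors = {p: set() for p in points}
    for a, b in arcs:
        successors[a].add(b)
    reach = {}
    for p in points:
        seen, stack = set(), [p]
        while stack:
            for nxt in successors[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        reach[p] = seen
    return reach


def related_counts(arcs):
    """Count, for each point of an acyclic digraph, the points that it
    reaches plus the points that reach it."""
    reach = closure(arcs)
    counts = {}
    for p in reach:
        reaching_p = sum(1 for q in reach if p in reach[q])
        counts[p] = len(reach[p]) + reaching_p
    return counts


# Hypothetical stratigraphic DAG.
arcs = {("a", "b"), ("b", "c"), ("a", "d"), ("e", "c")}
for point, count in sorted(related_counts(arcs).items()):
    print(point, count)
```

Contexts with high counts are related to a large part of the sequence, so a date for one of them constrains many others.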
Distance
The graph theoretic definition of distance follows directly on the definition of reachability. Hage and Harary define distance this way on page 69 of Structural Models in Anthropology.
If there is a path from u to v, then v is said to be reachable from u, and the distance d (u,v) from u to v is the length of any shortest such path.
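Because every arc has length one, the distance in this definition can be found with a breadth-first search. The following Python sketch illustrates the definition and is not the hm implementation; the function name and the example graph are hypothetical.

```python
from collections import deque


def distances_from(u, arcs):
    """Return {v: d(u, v)} for every point v reachable from u.

    Points that are not reachable from u are absent from the result.
    """
    successors = {}
    for a, b in arcs:
        successors.setdefault(a, set()).add(b)
    dist = {u: 0}
    queue = deque([u])
    while queue:
        point = queue.popleft()
        for nxt in successors.get(point, ()):
            if nxt not in dist:  # first visit is along a shortest path
                dist[nxt] = dist[point] + 1
                queue.append(nxt)
    return dist


# Hypothetical digraph with two paths from "a" to "c".
arcs = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("e", "d")}
dist = distances_from("a", arcs)
print(sorted(dist.items()))
```

The point "c" receives distance 1 rather than 2 because the definition takes the shortest path, and "e" is missing because it is not reachable from "a".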
An example of a graph classified by distance is Figure 4. This graph assigns node fills based on distance from Context 1332+. The node for Context 1332+ is the lightest blue. Nodes not reachable from Context 1332+ are the darkest blue. All other nodes have intermediate shades of blue based on their distance from Context 1332+: the farther the node, the darker the color.
This graph was produced on a Linux system with the following function call:
(run-project/example :catal-hoyuk-distance :draw-chronology nil)
The interesting parts of the configuration file follow.
[Graph analysis configuration]
distance-from = 1332+
[Graphviz sequence classification]
node-fillcolor-by = distance
[Graphviz sequence node fill color schemes]
levels =
distance = cet-blues
Levels
The graph theoretic concept of level is closely tied to the stratigraphic concept of level. Hage and Harary define level on page 82 of Structural Models in Anthropology.
An acyclic digraph D has no cycles … [and] is said to have a level assignment, which assigns a positive integer n for each point v, called its level, if for each arc vᵢ vⱼ of D, the corresponding integers satisfy nᵢ < nⱼ. Thus each arc of D is directed from a lower to a higher level. If a digraph D has a cycle, then it cannot have a level assignment.
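A level assignment that satisfies this definition can be built by giving each point a level one greater than the highest level among the points with arcs into it. The following Python sketch shows one such assignment; it is an illustration, it may differ from the assignment the hm software computes, and the function name and example graph are hypothetical.

```python
def level_assignment(arcs):
    """Return {point: level} with level[u] < level[v] for every arc uv.

    Raises ValueError if the digraph has a cycle, in which case no
    level assignment exists.
    """
    points = {p for arc in arcs for p in arc}
    predecessors = {p: set() for p in points}
    for u, v in arcs:
        predecessors[v].add(u)
    level = {}
    remaining = set(points)
    while remaining:
        # Points whose predecessors have all been assigned a level.
        ready = {p for p in remaining if predecessors[p].issubset(level)}
        if not ready:
            raise ValueError("digraph has a cycle")
        for p in ready:
            level[p] = 1 + max((level[q] for q in predecessors[p]), default=0)
        remaining -= ready
    return level


# Hypothetical acyclic digraph.
arcs = {("a", "b"), ("b", "c"), ("a", "c"), ("d", "c")}
levels = level_assignment(arcs)
print(sorted(levels.items()))
```

Points with no incoming arcs receive level 1, and the check for an empty ready set is where a cycle would be detected.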
An example of a levels classification can be seen in the stratigraphic DAG of Çatalhöyük (fig. 5).
The graph was produced on a Linux system with the following function call:
(run-project/example :catal-hoyuk-levels :draw-chronology nil)
The interesting parts of the configuration file follow.
[Graphviz sequence classification]
node-fillcolor-by = levels
[Graphviz sequence node fill color schemes]
levels = cet-bgyw
Write a Classification to a File
In addition to creating graphical output, the hm software can write a
classification to a comma-separated-value file to communicate with other
software. The name of the output file is specified in the configuration file;
this protocol ensures that the user of an hm project created by someone else
can be certain of a CSV file’s contents.
In the example below, the file names for the distance, reachable, and adjacent
classifications are helpfully descriptive, while the name given to the phases
classification is not.
[Output files]
sequence-dot = fig-12-polygon-1.dot
chronology-dot =
distance = fig-12-distance.csv
reachable = fig-12-reachable.csv
adjacent = fig-12-adjacent.csv
phases = foobar.csv
periods = fig-12-periods.csv
...
The following transcript writes the periods classifier for the
:fig-12-polygon example to the fig-12-periods.csv file.
(defvar *seq*)
(setq *seq* (run-project/example :fig-12-polygon))
(write-classifier :periods *seq*)
The fig-12-periods.csv file looks like this:
head ../examples/fig-12-chronology/fig-12-periods.csv
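Other software can read the file with any CSV library. The following Python sketch uses the standard csv module and makes no assumption about the column layout, which is not reproduced here; the function name read_classification is hypothetical.

```python
import csv


def read_classification(path):
    """Return the rows of a CSV file as lists of strings."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [row for row in csv.reader(handle)]
```

A call such as read_classification("fig-12-periods.csv") would return one list per line of the file, leaving any header row for the caller to interpret.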