An Image Analysis Approach to the Calligraphy of Books

An Image Analysis Approach to the Calligraphy of Books
Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Text network analysis has received increasing attention as a consequence of its wide range of applications. In this work, we extend a previous work founded on the study of topological features of mesoscopic networks. Here, the geometrical properties of visualized networks are quantified in terms of several image analysis techniques and used as subsidies for authorship attribution. It was found that the visual features account for performance similar to that achieved by using topological measurements. In addition, the combination of these two types of features improved the performance.


💡 Research Summary

The paper presents a novel approach to authorship attribution that combines traditional network‑based textual analysis with quantitative image‑processing techniques applied to visualizations of those networks. Building on earlier work that models texts as mesoscopic networks—where each node represents a window of Δ consecutive paragraphs and edges encode cosine similarity between tf‑idf vectors of the windows—the authors first construct weighted, undirected graphs for each document. To keep the graphs comparable, they prune low‑weight edges until every graph attains an average degree of 40, a threshold previously used in related studies.

After graph construction, the authors extract a suite of topological descriptors: 2‑hop and 3‑hop accessibility, node degree, backbone and merged symmetry (for radii 2, 3, 4), assortativity, average neighbor degree, and clustering coefficient. For each metric they compute summary statistics (mean, standard deviation, skewness) across all nodes, yielding a compact feature vector that captures the structural signature of the text.

The innovative contribution lies in the second feature stream. The mesoscopic graphs are rendered using a force‑directed layout (attraction between connected nodes, repulsion among all nodes) with fixed parameters, producing 2‑D visualizations that are then converted to binary images. A morphological preprocessing step—dilation followed by erosion—removes spurious holes; the dilation kernel size is adaptively set based on the principal eigenvalue of the image’s PCA, ensuring that larger objects receive proportionally larger kernels.

From these cleaned images the authors compute a diverse set of image‑based features:

  1. Geometric descriptors – total object area, perimeter, Euler number (objects minus holes), radius and centre of the minimum enclosing circle, convex‑hull area, hull perimeter, and residual area (convex‑hull area minus object area).
  2. Shape elongation – ratio of the first to second eigenvalues of the PCA of the object, reflecting how stretched the shape is.
  3. Lacunarity – a measure of the heterogeneity of void sizes, calculated using a self‑referential circular‑window method.
  4. Fourier‑domain descriptors – magnitude of the 2‑D Fourier transform is partitioned into non‑overlapping concentric rings; for each ring the mean, standard deviation, and Shannon entropy of the magnitudes are recorded.

These image features capture visual aspects of the network layout—such as the prevalence of long strings, loops, and hub‑like clusters—that are not directly encoded in the graph’s adjacency matrix.

For classification, the authors employ an unsupervised K‑Means clustering algorithm, setting the number of clusters to the known number of authors (10). They evaluate three configurations: (i) only topological features (NF), (ii) only image features (IF), and (iii) the concatenation of both (IF+NF). Using a dataset of 50 English texts from Project Gutenberg (five works each by ten classic authors, including Melville, Austen, Twain, Poe, etc.), they obtain the following accuracies: NF = 54 % (15 features), IF = 54 % (14 features), and IF+NF = 58 % (19 features). The modest improvement when combining the two feature families demonstrates that visual characteristics provide complementary information to pure network metrics.

Qualitative inspection of selected visualizations (Poe, Saki, Twain, Henry James) reveals author‑specific patterns: Poe and Saki produce sparse graphs with few nodes and short loops, reflecting their collections of unrelated short stories; Twain’s graphs are densely tangled with many long loops, suggesting recurrent narrative motifs; James’s works display a mixture of short loops and longer, more structured paths, possibly mirroring the focused, character‑driven nature of his stories. The authors discuss how such visual cues could influence the underlying topological measurements.

The paper acknowledges several limitations. K‑Means requires prior knowledge of the number of authors, which may not be realistic in real‑world forensic scenarios. The fixed average degree of 40 and the static force‑layout parameters may not be optimal for all corpora, and the sensitivity of image features to layout variations is not explored. Moreover, the classification framework remains unsupervised; supervised models (e.g., SVM, Random Forest, deep neural networks) could potentially leverage the rich feature set more effectively.

Future directions suggested include: (a) employing supervised learning with feature selection or dimensionality reduction to improve accuracy; (b) experimenting with alternative graph‑drawing algorithms (t‑SNE, UMAP, spectral layouts) and aggregating multiple visualizations to reduce layout bias; (c) integrating graph‑based convolutional networks with conventional CNNs to jointly learn from structural and visual data; and (d) conducting a systematic analysis of how pruning thresholds and Δ‑window sizes affect both topological and image descriptors.

In summary, the study demonstrates that the “calligraphy” of mesoscopic text networks—quantified through image analysis—offers a valuable, complementary perspective for authorship attribution. While the performance gains are modest, the methodology opens a new interdisciplinary avenue that bridges complex network theory, computer vision, and stylometry, encouraging further exploration of visual features in textual forensics.


Comments & Academic Discussion

Loading comments...

Leave a Comment