Learning Nested Named Entity Recognition from Flat Annotations


Nested named entity recognition identifies entities contained within other entities, but requires expensive multi-level annotation. While flat NER corpora exist abundantly, nested resources remain scarce. We investigate whether models can learn nested structure from flat annotations alone, evaluating four approaches: string inclusions (substring matching), entity corruption (pseudo-nested data), flat neutralization (reducing false negative signal), and a hybrid fine-tuned + LLM pipeline. On NEREL, a Russian benchmark with 29 entity types where 21% of entities are nested, our best combined method achieves 26.37% inner F1, closing 40% of the gap to full nested supervision. Code is available at https://github.com/fulstock/Learning-from-Flat-Annotations.


💡 Research Summary

The paper tackles the costly requirement of multi-level annotations for nested named entity recognition (NER) by investigating whether models can learn nested structure from flat (non-overlapping) annotations alone. Four complementary approaches are explored on the Russian NEREL benchmark, which covers 29 entity types and in which 21% of entities are nested.

  1. String Inclusions: Substrings of existing flat mentions that match other mentions are added as pseudo‑nested entities. Both raw surface matching and lemmatized matching (to handle Russian inflection) are evaluated, substantially increasing inner‑entity recall.
  2. Entity Corruption: Long flat entities are deliberately corrupted by replacing a word with random symbols (digits, nonsense letters, mixed alphanumerics, punctuation). Five corruption positions (start, middle, end, random, syntactic root) and five symbol types are tested. Corrupting the end word consistently yields the best pseudo‑nested signal.
  3. Flat Neutralization: Instead of treating all sub‑spans of a flat entity as negative examples, the method neutralizes only those sub‑spans that match known entity surfaces (identified via the inclusion step). Neutral spans are excluded from the loss, reducing false‑negative pressure.
  4. Hybrid Fine‑tuned + LLM Pipeline: A fine‑tuned Binder model first predicts outer entities. Each outer span is then fed to a large language model (DeepSeek‑R1‑32B or RuAdapt‑Qwen2.5‑32B) which, via few‑shot prompting, predicts inner entities. A pure LLM baseline (zero‑, one‑, five‑shot) is also evaluated with various example‑selection strategies (random, most‑frequent, type‑balanced).
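The string-inclusion step (approach 1) can be sketched as follows. This is an illustrative reconstruction, not the paper's code: annotations are assumed to be `(start, end, type)` character spans, and only raw surface matching is shown (the lemmatized variant for Russian inflection would normalize surfaces first).

```python
# Sketch of the "string inclusions" augmentation (assumed data format:
# character-offset spans (start, end, type) over a plain-text document).

def add_string_inclusions(text, mentions):
    """If the surface form of one flat mention occurs inside another,
    longer mention, add that occurrence as a pseudo-nested inner entity
    of the same type. Only the first occurrence per span is added."""
    # Map each known surface string to its entity type.
    surfaces = {text[s:e]: t for s, e, t in mentions}

    augmented = list(mentions)
    for start, end, _ in mentions:
        span_text = text[start:end]
        for surface, etype in surfaces.items():
            if surface == span_text or surface not in span_text:
                continue  # skip self-matches and non-occurrences
            pos = span_text.find(surface)
            inner = (start + pos, start + pos + len(surface), etype)
            if inner not in augmented:
                augmented.append(inner)
    return augmented


text = "Moscow State University is in Moscow"
flat = [(0, 23, "ORG"), (30, 36, "CITY")]
nested = add_string_inclusions(text, flat)
# "Moscow" inside "Moscow State University" becomes a pseudo-nested CITY span
```

Surface matching like this is cheap but noisy (it cannot distinguish coincidental substring matches), which is why the paper pairs it with neutralization rather than trusting it fully.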
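Entity corruption (approach 2) can be illustrated with the best-performing configuration reported above: replacing the end word with random digits. The function below is a hypothetical sketch of one plausible reading of the mechanism, in which the untouched prefix of the original entity serves as the pseudo-nested inner span; the exact construction in the paper may differ.

```python
import random

def corrupt_end_word(text, start, end, rng=None):
    """Replace the final word of a flat entity with random digits,
    treating the remaining prefix as a pseudo-nested inner span.
    Returns (corrupted_surface, inner_span) with the inner span given
    relative to the entity start, or None for single-word entities."""
    rng = rng or random.Random(0)
    words = text[start:end].split()
    if len(words) < 2:
        return None  # a single-word entity leaves no inner prefix
    # Same-length digit noise stands in for the corrupted end word.
    noise = "".join(rng.choice("0123456789") for _ in words[-1])
    corrupted = " ".join(words[:-1] + [noise])
    inner_len = len(" ".join(words[:-1]))
    return corrupted, (0, inner_len)
```

The paper's full grid also varies the corruption position (start, middle, end, random, syntactic root) and symbol type; only the end-word/digits cell is sketched here.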
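Flat neutralization (approach 3) amounts to masking certain candidate spans out of the training loss instead of labeling them as negatives. The sketch below assumes a span-classification setup (as in Binder-style models) where each candidate span gets a label and a loss weight; names and the exact masking criterion are illustrative assumptions.

```python
# Sketch of "flat neutralization": sub-spans of a flat entity whose
# surface matches a known entity string are excluded from the loss
# (weight 0) rather than trained as non-entities.

def build_span_labels(text, mentions, candidate_spans):
    """Return parallel (labels, weights) lists for candidate spans.
    Weight 0.0 marks a neutralized span that contributes no loss."""
    known_surfaces = {text[s:e] for s, e, _ in mentions}
    gold = {(s, e): t for s, e, t in mentions}

    labels, weights = [], []
    for s, e in candidate_spans:
        if (s, e) in gold:
            labels.append(gold[(s, e)]); weights.append(1.0)
        elif any(s >= ms and e <= me for ms, me, _ in mentions) \
                and text[s:e] in known_surfaces:
            labels.append("O"); weights.append(0.0)  # neutralized
        else:
            labels.append("O"); weights.append(1.0)  # true negative
    return labels, weights
```

Under this scheme a sub-span like "Moscow" inside "Moscow State University" is neither rewarded nor punished, which is exactly the reduction of false-negative pressure the method targets.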
Experiments show that string inclusions alone raise inner F1 from 3.84% to 21.36%. Adding end-position corruption and flat neutralization pushes inner F1 to 26.37%, closing roughly 40% of the gap to a fully nested-supervised model (≈66% inner F1). Overall F1 improves modestly (≈73% vs. 71% baseline). The hybrid pipeline achieves high overall F1 (70.16%) but lags on inner entities (≈19% inner F1), confirming current LLM limitations in handling fine-grained nested structures across many entity types.

The study demonstrates that inexpensive data augmentation (inclusions, corruption) and targeted training modifications (neutralization) can extract a substantial amount of nested information from flat corpora, offering a practical pathway for languages and domains where nested annotation is scarce. It also highlights that while LLMs excel at broad entity detection, they still struggle with hierarchical consistency, suggesting future work on multi-stage or feedback-driven integration of LLMs with fine-tuned models.
