Module mwe

logicaffeine_language

Multi-Word Expression (MWE) processing.

Post-tokenization pipeline that collapses multi-token sequences into single semantic units (e.g., “fire engine” → FireEngine).

§How It Works

The MWE pipeline runs between lexing and parsing:

1. Build a trie from known multi-word expressions using build_mwe_trie
2. Scan the token stream for matches using apply_mwe_pipeline
3. Replace each matched sequence with a single token
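
The three steps above can be illustrated with a minimal, self-contained sketch. It does not use this module's real types: `ToyTrie`, `insert`, `longest_match`, and `collapse` are hypothetical names invented for the example, tokens are modelled as plain strings, and the longest-match policy is an assumption rather than documented behavior of apply_mwe_pipeline.

```rust
use std::collections::HashMap;

/// Hypothetical trie node (not the crate's MweTrie): children are keyed by
/// word, and `target` holds the replacement when an expression ends here.
#[derive(Default)]
struct ToyTrie {
    children: HashMap<String, ToyTrie>,
    target: Option<String>,
}

impl ToyTrie {
    /// Step 1: register an expression and the single token it collapses to.
    fn insert(&mut self, words: &[&str], target: &str) {
        let mut node = self;
        for w in words {
            node = node.children.entry(w.to_string()).or_default();
        }
        node.target = Some(target.to_string());
    }

    /// Step 2: find the longest expression starting at `tokens[start]`.
    /// Returns the number of tokens consumed and the replacement.
    fn longest_match(&self, tokens: &[String], start: usize) -> Option<(usize, &str)> {
        let mut node = self;
        let mut best = None;
        for (offset, tok) in tokens[start..].iter().enumerate() {
            match node.children.get(tok) {
                Some(next) => {
                    node = next;
                    if let Some(t) = &node.target {
                        best = Some((offset + 1, t.as_str()));
                    }
                }
                None => break,
            }
        }
        best
    }
}

/// Step 3: scan left to right, replacing each matched sequence with one token.
fn collapse(trie: &ToyTrie, tokens: &[String]) -> Vec<String> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        if let Some((len, target)) = trie.longest_match(tokens, i) {
            out.push(target.to_string());
            i += len;
        } else {
            out.push(tokens[i].clone());
            i += 1;
        }
    }
    out
}
```

With ["fire", "engine"] registered against the target "FireEngine", calling `collapse` on the tokens of "the fire engine came" would yield "the", "FireEngine", "came". Preferring the longest match keeps a short expression from shadowing a longer one that shares its prefix.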

§Supported MWE Types

Compound nouns: “fire engine”, “ice cream”
Phrasal verbs: “look up”, “give in”
Fixed phrases: “in order to”, “as well as”

§Key Functions

build_mwe_trie: Construct the MWE lookup trie
apply_mwe_pipeline: Transform token stream by collapsing MWEs

Structs§

MweTarget
MweTrie

Functions§

apply_mwe_pipeline: Apply MWE collapsing to a token stream. Matches on lemmas (not raw strings) to handle morphological variants.
build_mwe_trie: Build the MWE trie from lexicon data.
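
Why matching on lemmas matters can be shown with a small sketch. Everything here is an assumption made for illustration: `Tok`, `collapse_pairs`, the pair-keyed map, and the "LookUp" target are hypothetical and are not this module's API, and the sketch handles only two-word expressions.

```rust
use std::collections::HashMap;

/// Hypothetical token that carries both the surface form and its lemma.
#[derive(Debug, Clone, PartialEq)]
struct Tok {
    raw: String,
    lemma: String,
}

impl Tok {
    fn new(raw: &str, lemma: &str) -> Self {
        Tok { raw: raw.to_string(), lemma: lemma.to_string() }
    }
}

/// Collapse two-word expressions by comparing lemmas rather than raw strings,
/// so an inflected form such as "looked up" still matches the entry "look up".
fn collapse_pairs(tokens: &[Tok], mwes: &HashMap<(String, String), String>) -> Vec<Tok> {
    let mut out = Vec::with_capacity(tokens.len());
    let mut i = 0;
    while i < tokens.len() {
        if i + 1 < tokens.len() {
            let key = (tokens[i].lemma.clone(), tokens[i + 1].lemma.clone());
            if let Some(target) = mwes.get(&key) {
                // Keep the original surface text, but emit one merged token.
                let raw = format!("{} {}", tokens[i].raw, tokens[i + 1].raw);
                out.push(Tok { raw, lemma: target.clone() });
                i += 2;
                continue;
            }
        }
        out.push(tokens[i].clone());
        i += 1;
    }
    out
}
```

Given the tokens "she", "looked" (lemma "look"), "up", and a map entry for ("look", "up"), the middle two tokens merge into one even though the raw string "looked" never appears in the map. A raw-string comparison would need a separate entry for every inflected form.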