
Excited to share that we have released RuBLiMP (Russian Benchmark of Linguistic Minimal Pairs), a novel benchmark for evaluating Russian language models (LMs).

RuBLiMP consists of 45,000 minimal pairs and includes 12 grammatical phenomena well-represented in Russian linguistics, covering morphology, syntax, and semantics. A minimal pair consists of a grammatical and an ungrammatical sentence (e.g., The cat is on the mat / *The cat are on the mat), and an LM is expected to prefer the grammatical one based on the scoring function.
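To make the evaluation idea concrete, here is a minimal sketch of minimal-pair scoring. It is not the RuBLiMP code: the toy bigram model, the sum-of-log-probabilities scoring function, and the names `train_bigram` and `minimal_pair_accuracy` are all assumptions made for illustration, and a real evaluation would plug in an actual LM's sentence score instead.

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Build a toy add-one-smoothed bigram scorer (a stand-in for a real LM)."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split()
        vocab.update(tokens)
        contexts.update(tokens[:-1])          # every token that precedes another
        bigrams.update(zip(tokens, tokens[1:]))
    vocab_size = len(vocab)

    def score(sentence):
        # Assumed scoring function: sum of smoothed bigram log-probabilities.
        tokens = ["<s>"] + sentence.lower().split()
        return sum(
            math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
            for a, b in zip(tokens, tokens[1:])
        )

    return score


def minimal_pair_accuracy(pairs, score):
    """Share of pairs where the grammatical sentence gets the higher score."""
    correct = sum(score(good) > score(bad) for good, bad in pairs)
    return correct / len(pairs)


# Hypothetical toy data: (grammatical, ungrammatical) pairs.
corpus = [
    "the cat is on the mat",
    "the cats are on the mat",
    "the dog is in the house",
]
pairs = [
    ("the cat is on the mat", "the cat are on the mat"),
    ("the cats are on the mat", "the cats is on the mat"),
]

score = train_bigram(corpus)
accuracy = minimal_pair_accuracy(pairs, score)
print(f"accuracy: {accuracy:.2f}")
```

Because the two sentences in a pair differ only minimally, comparing their scores isolates the model's sensitivity to that one grammatical contrast.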

Our approach makes it possible to:
🔸 generate minimal pairs at scale from any text domain
🔸 estimate whether a grammatical sentence appears in the LM's pretraining corpus

💡RuBLiMP can be used for evaluating the sensitivity of LMs to grammatical phenomena in Russian and for developing ranking and grammatical error detection methods.

🔸 Read more in our pre-print: https://arxiv.org/abs/2406.19232
🔸 HuggingFace: https://huggingface.co/datasets/RussianNLP/rublimp
🔸 GitHub: https://github.com/RussianNLP/RuBLiMP


