LLMLingua-2, originally developed and implemented in Python by Microsoft, is a small yet powerful prompt compression method.
llmlingua-2-js, ported by atjsh and maintained by its-iris, is a pure JavaScript/TypeScript implementation of LLMLingua-2, designed to run in web browsers and Node.js environments.
You can try the demo on GitHub Pages without installing anything.
The source code for the demo is available in the examples directory.
This implementation depends on @huggingface/transformers. Please check their requirements to see if your environment supports inference.
Install the dependencies and the library:
npm install @huggingface/transformers@4.0 git://github.com/its-iris/llmlingua-2-js.git
You can choose between models based on your needs.
| Model | Size | Pros | Cons | Public Model |
|---|---|---|---|---|
| TinyBERT | 57.1 MB | Very small, fast | Lower accuracy than larger models | atjsh/llmlingua-2-js-tinybert-meetingbank |
| MobileBERT | 99.2 MB | Small, optimized for mobile | Moderate accuracy, tradeoff in depth | atjsh/llmlingua-2-js-mobilebert-meetingbank |
| BERT | 710 MB | Faster, smaller size | Lower accuracy than XLM-RoBERTa | Arcoldd/llmlingua4j-bert-base-onnx |
| XLM-RoBERTa | 2240 MB | High accuracy | Slower, slightly larger in size | atjsh/llmlingua-2-js-xlm-roberta-large-meetingbank |
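One way to apply the table is to pick a model by download budget. The sketch below is only an illustration: `ModelOption`, `MODELS`, and `pickModel` are hypothetical names that are not part of llmlingua-2-js, and the rule "the largest model that fits is the most accurate" is an assumption drawn from the Pros and Cons columns, not a measured result. The sizes and repository names are copied from the table.

```typescript
// Sizes and repository names copied from the model table.
interface ModelOption {
  name: string;
  sizeMB: number;
  repo: string;
}

const MODELS: ModelOption[] = [
  { name: "TinyBERT", sizeMB: 57.1, repo: "atjsh/llmlingua-2-js-tinybert-meetingbank" },
  { name: "MobileBERT", sizeMB: 99.2, repo: "atjsh/llmlingua-2-js-mobilebert-meetingbank" },
  { name: "BERT", sizeMB: 710, repo: "Arcoldd/llmlingua4j-bert-base-onnx" },
  { name: "XLM-RoBERTa", sizeMB: 2240, repo: "atjsh/llmlingua-2-js-xlm-roberta-large-meetingbank" },
];

// Hypothetical helper (not a library API): return the largest model whose
// download size fits within the budget, or undefined if none fits.
// Assumes that a larger model is the more accurate choice.
function pickModel(maxSizeMB: number): ModelOption | undefined {
  return MODELS
    .filter((m) => m.sizeMB <= maxSizeMB)
    .sort((a, b) => b.sizeMB - a.sizeMB)[0];
}

// Example: a browser app that wants to stay under roughly 150 MB.
const choice = pickModel(150);
console.log(choice ? choice.repo : "No model fits this budget");
```

The selected `repo` string is the Hugging Face model name you would then pass to the library when loading a compressor. See the API reference for the actual loading calls.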
Learn more about the performance of each model (actual performance may vary).
For more details on how to use the library, please refer to the API reference documentation.
See LICENSE for details.
This software includes other software released under the following licenses: