nolano.ai
Introducing the Turbo LLM Inference Engine
Thrilled to introduce Nolano’s Turbo LLM Engine – turbocharged, low-latency inference for Large Language Models (LLMs).
Sep 21, 2023 • nolano.ai
August 2023
The LoRD (Low Rank Decomposition) of the Code LLMs
We release LoRDCoder models, built with our novel method for compressing code LLMs that can be combined with quantization and pruning.
Aug 20, 2023 • ayushkaushal, Irina Rish, and Tejas Vaidhya
March 2023
Int-4 LLaMa is not enough - Int-3 and beyond.
More compression makes it easier to build apps on LLMs that run locally.
Mar 13, 2023 • nolano.ai
LLMs running on your laptops and smartphones.
Building the Nolano community – our vision for the future!
Mar 11, 2023 • nolano.ai