nolano.ai
Introducing the Turbo LLM Inference Engine
Thrilled to introduce Nolano’s Turbo LLM Engine – turbocharged, low-latency inference for Large Language Models (LLMs).
Sep 21, 2023 • nolano.ai
August 2023
The LoRD (Low Rank Decomposition) of the Code LLMs
We release LoRDCoder models, built with our novel method for compressing code LLMs that can be combined with quantization and pruning.
Aug 20, 2023 • ayushkaushal, Irina Rish, and Tejas Vaidhya
March 2023
Int-4 LLaMa is not enough - Int-3 and beyond.
More compression makes it easier to build apps on LLMs that run locally.
Mar 13, 2023 • nolano.ai
LLMs running on your laptops and smartphones.
Building the Nolano community – our vision for the future!
Mar 11, 2023 • nolano.ai