Microsoft has announced that its Azure ND GB300 v6 virtual machine achieved an inference speed of 1.1 million tokens per second on Meta’s Llama2 70 B model.
CEO Satya Nadella took to X, calling it “An industry record made possible…

Microsoft has announced that its Azure ND GB300 v6 virtual machine achieved an inference speed of 1.1 million tokens per second on Meta’s Llama2 70 B model.
CEO Satya Nadella took to X, calling it “An industry record made possible…