Published: 2024-02-29
Nvidia, Hugging Face and ServiceNow are raising the bar for AI code generation with StarCoder2, a new family of open-access large language models (LLMs).
Available in three sizes, the models have been trained on more than 600 programming languages, including low-resource languages, to help enterprises accelerate a range of code-related tasks in their development workflows.
They were developed under the open BigCode project, a joint effort of ServiceNow and Hugging Face to ensure the responsible development and use of large language models for code.
They are made available royalty-free under the Open Responsible AI License (OpenRAIL).
“StarCoder2 testifies to the combined power of open scientific collaboration and responsible AI practices with an ethical data supply chain. The state-of-the-art open-access model improves on prior generative AI performance to increase developer productivity and provides developers equal access to the benefits of code generation AI, which in turn enables organizations of any size to more easily meet their full business needs,” Harm de Vries, lead of ServiceNow’s StarCoder2 development team and co-lead of BigCode, said in a statement.
StarCoder2: three models for three different needs
While BigCode’s original StarCoder LLM debuted in a single 15B-parameter size and was trained on roughly 80 programming languages, the latest generation surpasses it with models in three different sizes (3B, 7B and 15B) trained on 619 programming languages.
According to BigCode, the training data for the new models, known as The Stack, is more than seven times larger than the data used previously.
More importantly, the BigCode community used new training techniques for the latest generation to ensure the models can understand and generate low-resource content such as COBOL, mathematics and program source code discussions.
The smallest 3-billion-parameter model was trained using ServiceNow’s Fast LLM framework, while the 7B model was developed with Hugging Face’s nanotron framework.
Both aim to deliver high-performance text-to-code and text-to-workflow generation while requiring less compute.
Meanwhile, the largest 15 billion-parameter model has been trained and optimized with the end‐to‐end Nvidia NeMo cloud‐native framework and Nvidia TensorRT‐LLM software.
While it remains to be seen how well these models perform in different coding scenarios, the companies did note that the performance of the smallest 3B model alone matched that of the original 15B StarCoder LLM.
Depending on their needs, enterprise teams can use any of these models and fine-tune them further on their organizational data for different use cases. This can be anything from specialized tasks such as application source code generation, workflow generation and text summarization to code completion, advanced code summarization and code snippet retrieval.
The companies emphasized that the models, with their broader and deeper training, provide repository context, enabling accurate and context‐aware predictions. Ultimately, all this paves the way to accelerate development while saving engineers and developers time to focus on more critical tasks.
“Since every software ecosystem has a proprietary programming language, code LLMs can drive breakthroughs in efficiency and innovation in every industry,” Jonathan Cohen, vice president of applied research at Nvidia, said in the press statement.
“Nvidia’s collaboration with ServiceNow and Hugging Face introduces secure, responsibly developed models, and supports broader access to accountable generative AI that we hope will benefit the global community,” he added.
As mentioned earlier, all models in the StarCoder2 family are being made available under the Open RAIL-M license with royalty-free access and use. The supporting code is available on the BigCode project’s GitHub repository. As an alternative, teams can also download and use all three models from Hugging Face.
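For teams pulling the weights from Hugging Face, a minimal code-completion sketch with the `transformers` library might look like the following. The `bigcode/starcoder2-3b` checkpoint name and the generation settings are assumptions for illustration, not details from the announcement:

```python
CHECKPOINT = "bigcode/starcoder2-3b"  # assumed Hugging Face id for the smallest StarCoder2 model

def complete_code(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a code completion for `prompt` with a StarCoder2 checkpoint."""
    # transformers is Hugging Face's model library; the model weights
    # (several GB) are downloaded on first use.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    model = AutoModelForCausalLM.from_pretrained(CHECKPOINT)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Usage (not run here, since it downloads the full model):
# print(complete_code("def fibonacci(n):"))
```

The same sketch applies to the 7B and 15B checkpoints by swapping the model id; larger sizes trade compute for completion quality.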
That said, the 15B model trained by Nvidia is also coming to Nvidia AI Foundation, enabling developers to experiment with it directly from their browser or via an API endpoint.
While StarCoder is not the first entry in the space of AI-driven code generation, the wide variety of options the latest generation of the project brings certainly allows enterprises to take advantage of LLMs in application development while also saving on computing.
Other notable players in this space are OpenAI and Amazon. The former offers Codex, which powers the GitHub Copilot service, while the latter has its CodeWhisperer tool. There’s also strong competition from Replit, which has a few small AI coding models on Hugging Face, and Codeium, which recently nabbed $65 million in Series B funding at a valuation of $500 million.