December 11 Big Model Daily Collection

News · 2023 · Released by AIWindVane


[December 11 Big Model Daily Collection] Remarkable! Nearly 5,000 papers were submitted to EMNLP, and the awards have been announced: Peking University and Tencent won Best Long Paper; NVIDIA CEO Jensen Huang visited Vietnam and plans to build a chip production center in the country; the 01.AI Yi-34B-Chat fine-tuned model was launched and landed on multiple authoritative leaderboards

Remarkable! EMNLP received nearly 5,000 submissions, and the awards have been announced: Peking University and Tencent won Best Long Paper

Link: https://news.miracleplus.com/share_link/12884

EMNLP is one of the top conferences in the field of natural language processing, and EMNLP 2023 was scheduled to be held in Singapore from December 6th to 10th. Because the popularity of ChatGPT this year has driven interest in large models and NLP, the number of papers submitted to EMNLP 2023 reached nearly 5,000, slightly more than ACL 2023. The acceptance rate was 23.3% for long papers and 14% for short papers, for an overall acceptance rate of 21.3%. This is a slight increase from the 20% of EMNLP 2022. EMNLP 2023 gave one award each for Best Long Paper, Best Short Paper, Best Theme Paper, Best Demo Paper, and Best Industry Paper, as well as multiple outstanding paper awards across different tracks.
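As a rough consistency check on these figures, the sketch below estimates what share of submissions would have been long papers if the overall rate were simply a submission-weighted average of the long and short rates. That weighting is an assumption made here for illustration (the report does not say how the overall figure was computed), and the function name `implied_long_share` is hypothetical.

```python
def implied_long_share(long_rate: float, short_rate: float, overall_rate: float) -> float:
    """Solve overall = w * long + (1 - w) * short for w, the long-paper share.

    Assumption: the overall acceptance rate is a weighted average of only
    the long-paper and short-paper rates, weighted by submission counts.
    """
    return (overall_rate - short_rate) / (long_rate - short_rate)


# Rates quoted in the report, in percent.
share = implied_long_share(long_rate=23.3, short_rate=14.0, overall_rate=21.3)
print(f"Implied long-paper share of submissions: {share:.1%}")
```

Under that assumption, roughly four out of five submissions would have been long papers, which is at least consistent with the three reported rates.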

A diffusion model that drops attention: the SSM popularized by Mamba catches the eye of Apple and Cornell

Link: https://news.miracleplus.com/share_link/12885

Thanks to the release of “Mamba” last week, the state space model (SSM) is receiving more and more attention. The core of Mamba is a new architecture, the “selective state space model”, which makes Mamba comparable to, or even better than, the Transformer in language modeling. At the time, paper author Albert Gu said that Mamba’s success gave him confidence in the future of SSMs. Now, this paper from Cornell University and Apple appears to add a new example of the application prospects of SSMs.

Mixing multi-skilled large models like cocktails: Zhiyuan (BAAI) and other institutions release the LM-Cocktail model governance strategy

Link: https://news.miracleplus.com/share_link/12886

Recently, the Information Retrieval and Knowledge Computing Group of the Zhiyuan Research Institute (BAAI) released the LM-Cocktail model governance strategy, which aims to give large model developers a low-cost way to keep improving model performance: fusion weights are calculated from a small number of samples, and model fusion is then used to combine the advantages of the fine-tuned model and the original model, making efficient use of “model resources”.
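The idea described above, computing fusion weights from a few samples and then merging the fine-tuned and original models, can be sketched roughly as follows. This is a minimal illustration and not the LM-Cocktail implementation: the softmax-over-negative-loss weighting, the toy parameter dictionaries, and the names `fusion_weights` and `merge_parameters` are all assumptions introduced here.

```python
import math


def fusion_weights(losses, temperature=1.0):
    """Turn per-model losses on a few samples into weights that sum to 1.

    Assumption: a lower loss should earn a larger weight, implemented here
    as a softmax over the negative losses.
    """
    scaled = [-loss / temperature for loss in losses]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def merge_parameters(models, weights):
    """Weighted average of same-shaped parameters across models."""
    merged = {}
    for name, values in models[0].items():
        merged[name] = [
            sum(w * model[name][i] for w, model in zip(weights, models))
            for i in range(len(values))
        ]
    return merged


# Toy "models": one parameter tensor each, flattened to a list.
base_model = {"layer.weight": [1.0, 2.0]}
fine_tuned = {"layer.weight": [3.0, 4.0]}

# Hypothetical losses measured on a handful of target-task samples.
weights = fusion_weights([0.5, 0.5])
merged = merge_parameters([base_model, fine_tuned], weights)
print(weights, merged)
```

With equal losses the two models are averaged evenly; if the fine-tuned model scored a lower loss on the samples, it would contribute more to the merged parameters.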

OpenAI COO Brad Lightcap said that AI commercialization is overrated, we are still in a very early stage, and the most important parts have not yet been created

Link: https://news.miracleplus.com/share_link/12887

Brad Lightcap spoke with CNBC after the OpenAI “coup” incident. As Lightcap recalls, OpenAI had limited GPU and processing capacity and saw itself primarily as a company building tools for developers and enterprises. He recalls that CEO Sam Altman was a major proponent of trying a release, saying that text-based interaction with models carries important and personal meaning.

Nvidia CEO Jensen Huang visits Vietnam and plans to build a chip production center in the country

Link: https://news.miracleplus.com/share_link/12888

Vietnamese Prime Minister Pham Minh Chinh met with visiting Nvidia CEO Jensen Huang on the 10th local time. Huang said that Nvidia has invested approximately US$250 million in Vietnam and regards the country as an important market.

Li Auto: OTA 5.0 introduces Mind GPT to support free-form commands

Link: https://news.miracleplus.com/share_link/12889

At today’s Li Auto intelligent software launch conference, Li Auto introduced the upgrades in OTA 5.0, which fall into three areas: intelligent driving, intelligent space, and intelligent range extension. For the intelligent space, Li Auto said that the biggest change is the introduction of Mind GPT capabilities.

01.AI Yi-34B-Chat fine-tuned model goes online and lands on multiple authoritative leaderboards

Link: https://news.miracleplus.com/share_link/12891

Recently, many large model benchmarks in the industry have received another round of “strength value” updates. Following 01.AI's release of the Yi-34B base model in early November, the Yi-34B-Chat fine-tuned model was open sourced and launched on November 24. Within a short time, it landed on many authoritative English and Chinese large model leaderboards around the world, once again winning the attention of global developers. On AlpacaEval, the large language model evaluation benchmark proposed by Stanford University, Yi-34B-Chat surpassed LLaMA2 Chat 70B, Claude 2, and ChatGPT with a win rate of 94.08%, ranking second only to GPT-4 in English capability. It is also one of the few open source models in the category officially certified by AlpacaEval.

Zhipu AI releases text quality evaluation model CritiqueLLM

Link: https://news.miracleplus.com/share_link/12890

Zhipu AI recently proposed CritiqueLLM, an interpretable and scalable text quality evaluation model. The model provides high-quality evaluation scores and evaluation explanations for the outputs that large models generate on various instruction-following tasks. It addresses the problem of how to evaluate model performance quickly, effectively, fairly, and at low cost during research and development.

Japan’s Rakuten Group plans to launch its own large-scale language model within the next two months

Link: https://news.miracleplus.com/share_link/12892

Japan’s Rakuten Group plans to launch its own artificial intelligence language model within the next two months, CEO Hiroshi Mikitani said in an interview on Monday. Now, the fintech and e-commerce giant is looking to join other tech companies in developing this fast-growing technology.