Big Model Daily, March 2-3
[Big Model Daily, March 2-3] RNN efficiency rivals the Transformer as Google releases two new architectures in a row, stronger than Mamba at the same scale; Moonshot AI's Yang Zhilin interview: AI is not about finding PMF in the next one or two years, but about how it will change the world in the next ten to twenty; Will AGI appear within ten years? Can the next-generation Gemini sense the environment? DeepMind CEO Hassabis talks about AI
Are model preferences related only to size? Shanghai Jiao Tong University comprehensively analyzes the quantitative composition of the preferences of humans and 32 large models
https://news.miracleplus.com/share_link/20052
In the current model training paradigm, acquiring and using preference data has become indispensable. In training, preference data usually serves as the optimization target during alignment, as in reinforcement learning from human or AI feedback (RLHF/RLAIF) or direct preference optimization (DPO). In evaluation, because the complexity of the task usually means there is no standard answer, the preference annotations of human annotators or high-performing large models (LLM-as-a-Judge) are often used directly as the judging criterion. Although these uses of preference data have been widely successful, preferences themselves remain under-studied, which has significantly hindered the construction of more trustworthy AI systems. To address this, the Generative Artificial Intelligence Research Lab (GAIR) of Shanghai Jiao Tong University released a new study that systematically analyzes the preferences displayed by human users and as many as 32 popular large language models, to learn how preference data from different sources is quantitatively composed of predefined attributes such as harmlessness, humor, and acknowledgment of limitations.
RNN efficiency rivals the Transformer as Google releases two new architectures in a row: stronger than Mamba at the same scale
https://news.miracleplus.com/share_link/20053
Google DeepMind has made a new move in foundation models. In a recent paper, its researchers propose the RG-LRU layer, a novel gated linear recurrent layer, and design a new recurrent block around it to replace multi-query attention (MQA). They use this recurrent block to build two new models: Hawk, which mixes MLPs with recurrent blocks, and Griffin, which mixes MLPs with both recurrent blocks and local attention.
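To give a rough sense of what a gated linear recurrent layer computes, here is a minimal Python sketch of a generic gated linear recurrence. It is not the RG-LRU formulation from the paper: the update rule `h = a * h + (1 - a) * x`, the sigmoid gate, and the names `gated_linear_recurrence`, `gate_weight`, and `gate_bias` are all assumptions chosen purely for illustration.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gated_linear_recurrence(
    xs: List[float],
    gate_weight: float = 1.0,  # hypothetical parameter (would be learned)
    gate_bias: float = 0.0,    # hypothetical parameter (would be learned)
    h0: float = 0.0,
) -> List[float]:
    """Run a scalar gated linear recurrence over a sequence.

    The gate depends only on the current input, never on the previous
    state, so each update is linear in the hidden state h.
    """
    h = h0
    states = []
    for x in xs:
        a = sigmoid(gate_weight * x + gate_bias)  # input-dependent gate in (0, 1)
        h = a * h + (1.0 - a) * x                 # blend old state with new input
        states.append(h)
    return states


if __name__ == "__main__":
    # With gate_weight=0 the gate is fixed at 0.5, so each step averages
    # the previous state with the new input.
    print(gated_linear_recurrence([2.0, 4.0], gate_weight=0.0))
```

In this sketch the state has a fixed size and each step costs the same regardless of sequence length, which is the general property that makes recurrent layers attractive for long sequences.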
Unified video editing framework: Zhejiang University & Microsoft launch UniEdit, which requires no training and supports multiple editing scenarios
https://news.miracleplus.com/share_link/20054
With the popularity of Sora, people have seen the huge potential of AI video generation and are paying more and more attention to the field. Beyond generation, editing video is also an important real-world problem, with even broader application scenarios. Earlier video editing methods were often limited to the "appearance" level, such as applying style transfer to a video or replacing objects in it, and there have been few attempts to change the "action" of objects in a video. In this work, researchers from Zhejiang University, Microsoft Research Asia, and Peking University propose UniEdit, a unified video editing framework driven by text descriptions. It covers traditional appearance-editing scenarios such as style transfer, background replacement, and rigid/non-rigid object replacement, and it can also effectively edit the motion of objects in a video, for example changing a raccoon playing the guitar into one "eating an apple" or "waving". Besides its flexible natural language interface and unified editing framework, another major advantage of the method is that it requires no training, which makes it much easier to deploy and use.
Moonshot AI's Yang Zhilin interview: AI is not about finding PMF in the next one or two years, but about how it will change the world in the next ten to twenty
https://news.miracleplus.com/share_link/20055
Just a year ago, AI scientist Yang Zhilin did a careful calculation in Silicon Valley. He realized that if he decided to launch a large-model startup targeting AGI, he would need to raise more than $100 million within the next few months. And that was only the price of admission. A year later, that number had tripled. The competition among large-model companies is not so much a scientific contest as, first and foremost, a brutal financial struggle. With investors tightening their purse strings, you have to find more money ahead of your rivals, buy more GPUs, and poach more talent. "It requires gathering talent and gathering capital," said Yang Zhilin, founder and CEO of Moonshot AI, a large-model company established on March 1, 2023. Yang likes to think of his company as building a system that combines science, engineering, and business. You can picture him setting up an AI experimental platform above the human world: running experiments with one hand while putting cutting-edge technology into the real world with the other, finding application opportunities through interaction with people, and then delivering those applications to consumers. In the ideal case, the former burns billions or tens of billions in capital and the latter earns it back hundreds or thousands of times over. However you put it, it is as thrilling as walking a tightrope. "AI is not about what PMF I will find in the next one or two years, but how it will change the world in the next ten to twenty years," he said.
Will AGI appear within ten years? Can the next generation Gemini sense the environment? DeepMind CEO Hassabis talks about AI
https://news.miracleplus.com/share_link/20056
"I wouldn't be surprised if we have AGI-like systems in the next ten years," Demis Hassabis, co-founder and CEO of Google DeepMind, said recently on the Dwarkesh Podcast. During the hour-long show, Hassabis shared his thoughts on the nature of intelligence, reinforcement learning, scaling and alignment, AGI, multimodality, and other topics.
NVIDIA Jensen Huang: AI will pass any test in five years
https://news.miracleplus.com/share_link/20057
On Friday, March 1 (Eastern Time), Nvidia CEO Jensen Huang said at the Stanford Institute for Economic Policy Research (SIEPR) Economic Summit in California that he expects artificial general intelligence (AGI) to arrive in as little as five years. AGI, also known as "strong AI", is a theoretical form of artificial intelligence that can learn and reason like humans and has the potential to solve complex problems and make decisions independently. Because there is still no generally accepted definition of human intelligence, scientists in different fields define and measure AGI differently.
MWC 2024: let's see how outrageous manufacturers' new products can get!
https://news.miracleplus.com/share_link/20058
The Mobile World Congress (MWC) in Barcelona has always been the place to showcase the latest and most advanced mobile technologies to the world. The theme of this conference is "VELOCITY". Operators, phone makers, and technology vendors from around the world are exhibiting. Artificial intelligence dominates most of the major product launches, and the focus of the conference is on combining AI with smartphones. Many smartphone makers point out that AI running on the device improves security, unlocks new applications, and is faster because the data is processed on the phone itself. CCS Insight chief analyst Ben Wood agreed: "I think a big news at MWC will be the ability of large AI models to run on the device itself, which will very likely change the rules of the game."