December 20th Big Model Daily Collection

News · Published 7 months ago by AIWindVane
[December 20th Big Model Daily Collection] “Taylor Swift” sings “Daoxiang” as the domestic team’s Amphion audio generation project goes viral; Understanding Google Gemini: in CMU’s comprehensive evaluation, Gemini Pro loses to GPT-3.5 Turbo; Large model + search builds a complete technology stack, and Baichuan Intelligence uses search enhancement to give enterprise customization a “strong medicine”; Can video generation be infinitely long? Google’s VideoPoet large model goes online, and netizens call it revolutionary technology; Microsoft Copilot adds a major new feature: text directly generates ultra-realistic music; Intel Gaudi2C AI accelerator card appears in the Linux driver, reportedly a special version for China

“Taylor Swift” sings “Daoxiang” as the domestic team’s Amphion audio generation project goes viral



The team of Associate Professor Wu Zhizheng of the School of Data Science at The Chinese University of Hong Kong (Shenzhen) and the OpenMMLab team of Shanghai Artificial Intelligence Laboratory have open-sourced Amphion, a comprehensive audio generation project. The system aims to be an open-source platform that integrates speech synthesis and conversion, singing voice synthesis and conversion, sound-effect and music generation, and other functions. So far, Amphion has appeared on the GitHub Trending repositories list many times.

Understanding Google Gemini: in CMU’s comprehensive evaluation, Gemini Pro loses to GPT-3.5 Turbo



Some time ago, Google released Gemini, a competitor to OpenAI’s GPT models. The model comes in three versions: Ultra (the most capable), Pro, and Nano. Test results published by the research team show that the Ultra version outperforms GPT-4 on many tasks, while the Pro version is on par with GPT-3.5. Although these comparisons matter a great deal for large language model research, the exact evaluation details and model predictions have not yet been made public, which limits reproduction and verification of the results and makes further analysis of the underlying details difficult. To understand Gemini’s true strength, researchers from Carnegie Mellon University and BerriAI conducted an in-depth study of the model’s language understanding and generation capabilities.

NeurIPS 2023 Spotlight | Tencent AI Lab’s new breakthrough: Flexible strategies to deal with professional players in StarCraft 2



Recently, the game AI team of Tencent AI Lab announced the latest research progress of its decision-making AI “Juewu” in “StarCraft 2”. The team proposed an innovative training method that significantly improves the AI’s in-game strategic adaptability. In a fair match environment that takes APM (actions per minute) into account, the AI played up to 20 Protoss vs. Protoss games against three of the country’s top Protoss professional players, consistently maintaining a win rate of 50% or higher. The work was accepted as a NeurIPS 2023 Spotlight paper.

Large model + search builds a complete technology stack. Baichuan Intelligence uses search enhancement to give enterprise customization a “strong medicine”



Baichuan Intelligence has officially opened the search-enhanced Baichuan2-Turbo series API, which includes Baichuan2-Turbo-192K and Baichuan2-Turbo. The series not only supports a 192K ultra-long context window but also adds a search-enhanced knowledge base capability. All users can upload their own text materials to build a dedicated knowledge base, and then build more complete and efficient intelligent solutions around their own business needs. At the same time, Baichuan Intelligence has upgraded the model experience on its official website, which now officially supports PDF upload and URL input. Ordinary users can try out, through the official website, the gains in general intelligence that the long context window and search enhancement bring.

Can video generation be infinitely long? Google VideoPoet large model is online, netizens: revolutionary technology



At the end of 2023, technology companies are taking on the last frontier of generative AI: video generation. On Tuesday this week, the large video generation model proposed by Google went online and immediately attracted attention. This large language model, called VideoPoet, is considered a revolutionary zero-shot video generation tool. VideoPoet can generate videos from text and images, and it also supports style transfer and video-to-audio generation. In terms of results, it can produce diverse and smooth motion.

Flagship phones race to add generative AI, which will trigger an interaction revolution in 2024



Recently, mobile phone manufacturers have all been doing the same thing: bringing generative AI to phones. First, at the Snapdragon Summit in October, Xiaomi announced a 6-billion-parameter model that can run on the device. The model is built into the company’s new-generation phone system, where it can answer a variety of complex questions and help users generate articles and tables or write code. Honor gave an early demonstration of the generative AI capabilities of its next-generation flagship phone, the Magic 6: by giving natural-language instructions on the phone, users can ask the AI to search the footage they have shot and combine the appropriate parts into a video. Then in November, manufacturers such as vivo and OPPO announced the generative AI capabilities of their new flagship phones, whose systems will also be deeply integrated with AI. Two weeks ago, Google Gemini, which claims to surpass GPT-4, added fuel to this trend.

Microsoft Copilot adds a major new feature: text directly generates ultra-realistic music



On December 20, Microsoft announced on its official website that it had partnered with Suno, a leader in text-to-music generation, to integrate Suno’s functions into Copilot. Users can generate rock, pop, classical, punk, folk, and other types of music from text. The music generated by the Suno platform does not sound noticeably robotic; the results are better than those of Google’s Lyria and Meta’s MusicGen and are almost indistinguishable from a real person singing. Whether or not you can play an instrument or read sheet music, you now only need to type your ideas into Microsoft Copilot to quickly generate music.

Generate accurate image captions from text: Google and others open-source PixelLLM



Traditional large language models can describe images, answer image-related questions, and even perform complex image reasoning. But using large language models for localization tasks, such as locating text in an image or referring to exact coordinates, is less feasible. To explore this capability, researchers at Google and the University of California, San Diego developed the Pixel-Aligned Large Language Model, PixelLLM. PixelLLM can take image location information as input or produce it as output. When given a location as input, the model generates descriptive text about the specified object or region. When generating locations as output, the model produces pixel coordinates for each output word, enabling dense word localization.

Baidu Lingjing Matrix is upgraded to an intelligent agent platform, opening an era in which everyone can develop intelligent agents



Baidu’s “Lingjing Matrix” platform has been upgraded to the “Wenxin Large Model Intelligent Agent Platform”. Built on the Wenxin large model, Lingjing Matrix offers developers a range of development methods to choose from according to their own industry and application scenario, so that they can create native applications for the large model era. Lingjing Matrix also has the most complete intelligent agent ecosystem in China. It relies on the powerful Wenxin large model, has received applications to join from more than 30,000 developers, and can draw on Baidu’s full range of scenarios to obtain more traffic distribution paths and business opportunities. At present, many intelligent agents, such as legal intelligent assistants, TreeMind tree diagrams, and the Workplace Password AI intelligent resume, have completed the path from development to distribution to monetization through Lingjing Matrix.

Baidu Smart Cloud Qianfan AppBuilder fully opens its services, allowing everyone to develop AI-native applications



Baidu Smart Cloud announced that Qianfan AppBuilder, its AI-native application development workbench, has fully opened its services, enabling everyone to develop their own AI-native applications.

Intel Gaudi2C AI accelerator card appears in Linux driver, reportedly a special version for China



In July this year, Intel launched a Gaudi2 processor for the Chinese market, which is mainly used to accelerate AI training and inference. A new accelerator card version is also on the way: Intel has added support for Gaudi2C to its Linux driver, Phoronix reports. It is unclear how Gaudi2C differs from Gaudi2, with reports suggesting it could be a “limited” variant that is still exclusive to the Chinese market. Tom’s Hardware also said that it may be a cut-down version of Gaudi2.
