Collection of Daily Large Models from October 28th to October 29th


[Collection of Daily Large Models from October 28th to October 29th] Google spends $2 billion on Anthropic as the large-model arms race escalates; the best 7B model changes hands again, beating the 70B Llama 2 and running on an Apple laptop, open source and free; better than Transformer: BERT and GPT without attention and MLPs are actually stronger


Google spends $2 billion on Anthropic: Big model arms race escalates

 

Link: https://news.miracleplus.com/share_link/11252

The company has invested $500 million upfront in the key OpenAI competitor and has agreed to invest an additional $1.5 billion over time, a spokesperson said. Google is already a significant investor in Anthropic, and the new investment will help it step up its efforts to compete with Microsoft as big tech companies race to integrate artificial intelligence into their businesses. Anthropic is a generative AI startup founded in 2021 by Dario Amodei, former vice president of research at OpenAI, Tom Brown, first author of the GPT-3 paper, and others. Headquartered in San Francisco, California, most of its founding members were core employees of OpenAI who were deeply involved in research projects such as GPT-3 and reinforcement learning from human feedback (RLHF).


The best 7B model changes hands again! Beats the 70B Llama 2 and runs on an Apple laptop | Open source and free

 

Link: https://news.miracleplus.com/share_link/11253

A 7-billion-parameter model that cost $500 to fine-tune beats the 70-billion-parameter Llama 2! It runs easily on a laptop, delivers results comparable to ChatGPT, and, crucially, is free and open source. The model is Zephyr-7B, created by the Hugging Face H4 team. Its base model is Mistral-7B, the open-source large model from Mistral AI, the startup that recently shot to fame as the "European OpenAI". Less than two weeks after Mistral-7B's release, fine-tuned variants appeared one after another, much like the wave of "alpaca" models that followed the original Llama. The key to Zephyr standing out among these variants is that the team used Direct Preference Optimization (DPO) to fine-tune the Mistral base model on public datasets. The team also found that removing the built-in alignment of the dataset further improves MT-Bench performance. The first-generation Zephyr-7B-alpha scores an average of 7.09 on MT-Bench, surpassing Llama2-70B-Chat.
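DPO replaces RLHF's separate reward model and RL loop with a single classification-style loss over preference pairs. A minimal sketch of that loss in plain Python, illustrative only; in the real recipe the arguments are summed per-token log-probabilities of a full response under the trainable policy and a frozen reference model:

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Each argument is the log-probability of a full response under
    the policy being trained or the frozen reference model.
    """
    chosen_reward = beta * (policy_logp_chosen - ref_logp_chosen)
    rejected_reward = beta * (policy_logp_rejected - ref_logp_rejected)
    # Logistic loss on the reward margin: minimized when the policy
    # prefers the chosen response more strongly than the reference does.
    margin = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss shrinks as the policy widens the gap in favor of the chosen answer.
low = dpo_loss(-10.0, -20.0, -12.0, -18.0)   # policy prefers the chosen response
high = dpo_loss(-20.0, -10.0, -18.0, -12.0)  # policy prefers the rejected one
```

Because the reference model anchors both terms, the policy is rewarded for shifting its preferences relative to where it started, not for drifting arbitrarily far from the base model.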


Yuanchengxiang Chatimg3.0 arrives, catching up with GPT-4V and offering new ways to upgrade industries

 

Link: https://news.miracleplus.com/share_link/11254

At the CNCC 2023 "Super Intelligence Fusion AI Large Model Application Implementation Development Forum" held on October 28, Sophon Engine released Yuanchengxiang Chatimg3.0 and demonstrated the latest progress and deployment explorations of this multi-modal general-purpose generation model. Yuanchengxiang Chatimg3.0 is a large multi-modal model with ultra-fine-grained recognition and fewer hallucinations; it also supports multi-image understanding, object localization, OCR, and other functions. Chatimg3.0 gives hardware devices a brain, enabling more natural and fluent human-machine communication and laying a solid foundation for industrial applications powered by multi-modal large models. Compared with Chatimg2.0, Chatimg3.0 has been upgraded in two main respects: the first-stage pre-training (multi-task training on description, detection, OCR, etc.) and the second-stage instruction fine-tuning (a high-quality, manually screened instruction set).


How do multi-modal search algorithms make video search more accurate? Tencent reveals the details

 

Link: https://news.miracleplus.com/share_link/11255

Video search is the largest vertical in search: video results appear under roughly 50% of search queries. However, video resources differ from text web resources, which brings new technical challenges in video understanding, video matching and ranking, and interaction behavior. Multimodal technology has gained attention in recent years; especially after the Transformer architecture shone in NLP, it has extended to modalities such as vision and audio, making cross-modal fusion far more convenient and feasible. The development of multi-modal pre-training (e.g., ViLBERT, VisualBERT, VL-BERT, ERNIE-ViL), multi-modal fusion techniques (matrix-based, plain-NN-based, attention-based, etc.), multi-modal alignment techniques, and contrastive learning (e.g., CLIP) has made rapid improvements in video search quality possible. As a tool serving tens of millions of people every day, the search function of Tencent QQ Browser plays an important role, and with the trend toward video production and consumption in recent years, people have grown used to consuming and searching for videos.
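The CLIP-style contrastive learning mentioned above aligns image and text embeddings by pulling matched pairs together and pushing mismatched pairs apart within a batch. A minimal sketch of the symmetric InfoNCE objective on plain Python lists, illustrative only; real systems use tensors and learned encoders:

```python
import math

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    image_emb, text_emb: lists of equal-length unit vectors, where
    image_emb[i] and text_emb[i] form a matching pair.
    """
    n = len(image_emb)
    # Cosine similarities scaled by temperature.
    logits = [[sum(a * b for a, b in zip(image_emb[i], text_emb[j])) / temperature
               for j in range(n)] for i in range(n)]

    def cross_entropy(row, target):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        return log_z - row[target]

    # Image-to-text and text-to-image directions, averaged.
    i2t = sum(cross_entropy(logits[i], i) for i in range(n)) / n
    t2i = sum(cross_entropy([logits[i][j] for i in range(n)], j) for j in range(n)) / n
    return (i2t + t2i) / 2
```

For video search the same idea carries over with a video encoder in place of the image encoder, so queries and videos land in one shared embedding space for retrieval.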


Better than Transformer: BERT and GPT without attention and MLPs are actually stronger

 

Link: https://news.miracleplus.com/share_link/11256

https://mp.weixin.qq.com/s/rjW-0pMCKWp-SNjgFJEfmw
From language models such as BERT, GPT, and Flan-T5 to image models such as SAM and Stable Diffusion, Transformers are sweeping the field with unstoppable momentum, but one cannot help asking: is the Transformer the only option? A team of researchers from Stanford University and the State University of New York at Buffalo not only answers this question in the negative, but also proposes a new alternative: Monarch Mixer. The team recently published the paper on arXiv along with checkpoint models and training code. Notably, the paper has been accepted to NeurIPS 2023 as an oral presentation.
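Monarch Mixer replaces both attention and MLPs with Monarch matrices, which are expressive yet sub-quadratic: a Monarch matrix factors into block-diagonal matrices interleaved with a fixed "transpose" permutation. A minimal sketch of applying such a matrix to a vector, in plain Python and illustrative only; with block size √n the cost is O(n^1.5) rather than the O(n^2) of a dense matrix:

```python
import math

def blockdiag_matvec(blocks, x):
    """Multiply a block-diagonal matrix (a list of b x b blocks) by vector x."""
    b = len(blocks[0])
    out = []
    for k, block in enumerate(blocks):
        seg = x[k * b:(k + 1) * b]
        out.extend(sum(block[i][j] * seg[j] for j in range(b)) for i in range(b))
    return out

def monarch_matvec(b1_blocks, b2_blocks, x):
    """Apply a Monarch matrix M = P B2 P B1 to x, where P is the
    permutation that views the length-n vector as a sqrt(n) x sqrt(n)
    grid and transposes it (an involution, so P is its own inverse)."""
    n = len(x)
    b = int(math.isqrt(n))
    perm = [(i % b) * b + i // b for i in range(n)]  # transpose permutation
    y = blockdiag_matvec(b1_blocks, x)
    y = [y[perm[i]] for i in range(n)]
    y = blockdiag_matvec(b2_blocks, y)
    y = [y[perm[i]] for i in range(n)]
    return y
```

Each block-diagonal factor mixes only within blocks, while the interleaved transpose lets information cross blocks, so the product still connects every input to every output.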


Multimodal LLM hallucinations reduced by 30%! "Woodpecker", the industry's first training-free correction method, is born | University of Science and Technology of China

 

Link: https://news.miracleplus.com/share_link/11257

Are you still using instruction fine-tuning to solve the hallucination problem of large multi-modal models? A study from the University of Science and Technology of China takes a brand-new approach: a universal architecture that requires no retraining and is plug-and-play. It starts directly from the erroneous text the model produced, works backwards to locate where hallucinations may have occurred, compares the claims against the image to establish the facts, and finally applies the correction directly. They named the method "Woodpecker": just as the "forest doctor" first finds the wormholes in a tree and then eats the bugs inside, the Woodpecker proposed here is a hallucination doctor for multi-modal large models, first diagnosing the problems and then correcting them one by one.
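The diagnose-then-correct loop can be caricatured in a few lines. This is a toy sketch only: the actual method extracts claims with an LLM and validates them against the image using visual tools such as detectors and VQA models, for which the dictionaries here are stand-ins:

```python
def woodpecker_correct(caption_claims, visual_facts):
    """Toy sketch of a training-free, diagnose-then-correct loop:
    each claim extracted from the model's output is checked against
    facts grounded in the image, and contradicted claims are replaced.

    caption_claims: dict mapping entity -> attribute the model claimed
    visual_facts:   dict mapping entity -> attribute actually observed
    """
    corrected = {}
    diagnoses = []
    for entity, claim in caption_claims.items():
        fact = visual_facts.get(entity)
        if fact is None:
            # The entity has no visual grounding at all: drop it.
            diagnoses.append(f"hallucinated entity: {entity}")
            continue
        if fact != claim:
            diagnoses.append(f"wrong attribute for {entity}: {claim} -> {fact}")
        corrected[entity] = fact
    return corrected, diagnoses
```

Because the loop only post-processes model output, it can be bolted onto any multi-modal LLM without touching its weights, which is the sense in which the method is plug-and-play.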


Peking University team: all it takes to induce hallucinations in a large model is a string of garbled characters! Llama variants large and small are all affected

Link: https://news.miracleplus.com/share_link/11258

The latest research from a Peking University team finds that random tokens can induce hallucinations in large models! Popular large models such as Baichuan2-7B, InternLM-7B, ChatGLM, Ziya-LLaMA-7B, LLaMA-7B-chat, and Vicuna-7B are all affected. This means random strings can steer large models into outputting arbitrary content, effectively "endorsing" hallucinations. The findings come from the research group of Professor Yuan Li at Peking University. The study argues that the hallucination phenomenon of large models is very likely just adversarial examples viewed from another angle. While demonstrating two methods that easily induce large-model hallucinations, the paper also proposes a simple and effective defense, and the code has been open-sourced.
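Viewed through the adversarial-examples lens, such trigger strings can be found by simple search over tokens. A toy black-box sketch, illustrative only; the paper's actual attack methods differ, and `score` here is a stand-in for a real model's likelihood of emitting a target hallucination:

```python
import random

def greedy_token_attack(score, vocab, length=8, iters=200, seed=0):
    """Toy sketch of searching for an adversarial token string:
    start from random tokens, then greedily accept single-position
    swaps whenever they increase the attack objective `score`.
    """
    rng = random.Random(seed)
    tokens = [rng.choice(vocab) for _ in range(length)]
    best = score(tokens)
    for _ in range(iters):
        pos = rng.randrange(length)
        cand = tokens[:]
        cand[pos] = rng.choice(vocab)  # propose one random substitution
        s = score(cand)
        if s > best:                   # keep the swap only if it helps
            tokens, best = cand, s
    return tokens, best
```

The same hill-climbing skeleton underlies many prompt-level attacks; defenses typically try to detect inputs that are this far off the natural-language distribution.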


HyperHuman, a higher-definition, more realistic human body generation model, is here: based on latent structural diffusion, it sets many new SOTAs

 

Link: https://news.miracleplus.com/share_link/11259

To introduce structural control information into text-to-image generation, recent representative works such as ControlNet [1] and T2I-Adapter [2] add very lightweight, plug-and-play learnable branches to steer a pretrained text-to-image diffusion model. However, the feature gap between the original diffusion branch and the newly added learnable branch often leads to inconsistency between the generated results and the control signal. To address this, HumanSD [3] uses a native control-guidance method that directly concatenates the human skeleton map with the diffusion model's input along the feature dimension. In this work, a team from Snap Research, the Chinese University of Hong Kong, the University of Hong Kong, and Nanyang Technological University presents HyperHuman, a new highly realistic human body generation model that jointly learns explicit human appearance and implicit multi-level human structure. On the zero-shot MS-COCO dataset it achieves the best image quality (FID, FID_CLIP, KID) and generation-pose consistency (AP, AR) metrics, obtains excellent text-image alignment (CLIP score), and achieves the best results in large-scale subjective user evaluations.


How can small models compete with large ones? BIT releases the "Mingde" large model MindLLM; small models have huge potential

 

Link: https://news.miracleplus.com/share_link/11260

The natural language processing team at Beijing Institute of Technology has released MindLLM ("Mingde" LLM), a series of bilingual lightweight large language models, together with a comprehensive account of the experience accumulated during their development, covering every step of the process: data construction, model architecture, evaluation, and application. Trained from scratch and available in 1.3B and 3B versions, MindLLM consistently matches or exceeds the performance of other open-source large models on some public benchmarks. MindLLM also introduces an innovative instruction-tuning framework tailored to small models that effectively enhances their capabilities. In addition, MindLLM shows excellent domain adaptability when applied in specific vertical fields such as law and finance.


A team behind Stable Diffusion wants to open source emotion-detecting AI

 

Link: https://news.miracleplus.com/share_link/11275

In 2019, Amazon upgraded its Alexa assistant with a feature that lets it detect when a customer is likely frustrated and respond with more empathy. For example, if a customer asks Alexa to play a song, it queues the wrong one, and the customer then says "No, Alexa" in a frustrated tone, Alexa may apologize and ask for clarification. Now, the team behind a dataset used to train the text-to-image model Stable Diffusion wants to bring similar emotion-detection capabilities to every developer, for free.


Generated molecules are nearly 100% valid: a guided diffusion model for inverse molecular design

 

Link: https://news.miracleplus.com/share_link/11262

"De novo molecular design" is the "holy grail" of materials science. The introduction of generative deep learning has greatly advanced this direction, but molecular discovery remains challenging and often inefficient. A research team from the Technion - Israel Institute of Technology and Ca' Foscari University of Venice in Italy proposed GaUDI, a guided diffusion model for inverse molecular design that combines an equivariant graph neural network with a generative diffusion model. The researchers demonstrated GaUDI's effectiveness in designing molecules for organic electronics by applying single- and multi-objective tasks to a generated dataset of 475,000 polycyclic aromatic systems. GaUDI shows improved conditional design, generating molecules with optimal properties and even going beyond the original distribution to propose molecules better than those in the dataset. Beyond point-wise targets, GaUDI can also be guided toward open-ended targets such as minima or maxima, and in all cases the validity of the generated molecules is close to 100%.
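Property-guided sampling of this kind nudges each reverse-diffusion step with the gradient of a property predictor. A toy 1-D sketch in plain Python, illustrative only; `denoise_step` and `property_grad` are stand-ins for the trained diffusion model and the gradient of a learned property model:

```python
import random

def guided_sample(denoise_step, property_grad, x0, steps=50, guide=0.5, seed=0):
    """Toy 1-D sketch of property-guided diffusion sampling: at each
    reverse step the update is nudged by the gradient of a property
    predictor, steering generation toward samples with the desired
    property (for GaUDI, molecules; here, a scalar)."""
    rng = random.Random(seed)
    x = x0
    for t in range(steps, 0, -1):
        x = denoise_step(x, t)              # model's reverse-diffusion step
        x = x + guide * property_grad(x)    # guidance toward the target property
        x = x + 0.01 * rng.gauss(0.0, 1.0)  # small stochastic term
    return x
```

Setting `guide=0.0` recovers unconditional sampling, so the same trained model serves both unconditional generation and property-targeted design.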


Provide ChatGPT services to 30,000 employees! One of Asia’s largest banks partners with Microsoft

 

Link: https://news.miracleplus.com/share_link/11263

According to Asian Financial Briefing, Oversea-Chinese Banking Corporation (OCBC), one of Asia's largest banks, will provide its OCBC ChatGPT service to 30,000 employees worldwide starting in November, including its wholly-owned subsidiary Bank of Singapore (one of Asia's largest private banks). As early as April this year, OCBC entered into a technical partnership with Microsoft Azure OpenAI, combining its massive financial data with fine-tuning to create a banking-domain ChatGPT assistant for text generation, content summarization, drafting emails, translating content, writing investment reports, and more. After six months of joint testing with more than 1,000 employees across core departments such as investment, management, marketing, and operations, OCBC ChatGPT will now officially serve as a daily tool providing OCBC staff with safe and reliable generative AI services.
